JC: General comments: Some answers missing, especially for the rdf section, and a few mistakes. Try to add a bit more detail and background theory into your explanations and make sure that your results make sense and are what you would expect.

Boglarka's Computational Lab: Molecular dynamics simulations of simple liquids

Introduction & Running the first simulation

In this section,some trial simulations were run with various time steps: 0.001, 0.0025, 0.0075, 0.01 and 0.015, through the High Performance Computing (HPC) systems. The output of these calculations carried out with the LAMMPS software is presented an discussed in the section below.

Introduction to molecular dynamics simulation

In this section of the experiment, the calculations carried out in the Introduction section are presented and processed, after investigating some of the background Theory.

TASK 1: "Open the file HO.xls. In it, the velocity-Verlet algorithm is used to model the behaviour of a classical harmonic oscillator. Complete the three columns "ANALYTICAL", "ERROR", and "ENERGY": "ANALYTICAL" should contain the value of the classical solution for the position at time $t$ , "ERROR" should contain the absolute difference between "ANALYTICAL" and the velocity-Verlet solution (i.e. ERROR should always be positive -- make sure you leave the half step rows blank!), and "ENERGY" should contain the total energy of the oscillator for the velocity-Verlet solution. Remember that the position of a classical harmonic oscillator is given by $x (t) = A \cos (ω t + ϕ)$ (the values of $A$ , $ω$ , and $ϕ$ are worked out for you in the sheet)."

In this task, the absolute error between the analytical solution to the position of the atom (given by the Classical Particle Approximation in the task, and completed for all time-steps with a given x(t) ) and the solution obtained through the use of the Velocity-Verlet Algorithm was calculated.

The plots below show the graphical output of the calculations of the position of the atom using both the Classical Particle Approximation and the the total Energy calculated from the value of v(t) from the Velocity-Verlet Algorithm, as given by E = 0.5m(v(t)^2)).

Plot to show comparison of velocity-verlet and analytical calculations	Plot to show Energy as a function of Time

TASK 2: For the default timestep value, 0.1, estimate the positions of the maxima in the ERROR column as a function of time. Make a plot showing these values as a function of time, and fit an appropriate function to the data.

As shown in the plot above, the maxima are at time 2, 4.8, 8 and 11.1. The function fitted to the Energy plot was done using Excel software was a combination of logarithmic, power and moving average, while the line of best fit for the maxima was determined to be linear, with the equation displayed in the plot above.

As can be seen from the plot, the error becomes more significant for each oscillation, due to the fact that each calculation uses the result obtained from the previous one.

JC: Why does the error oscillate?

TASK 3: Experiment with different values of the timestep. What sort of a timestep do you need to use to ensure that the total energy does not change by more than 1% over the course of your "simulation"? Why do you think it is important to monitor the total energy of a physical system when modelling its behaviour numerically?

To ensure that the energy does not change by more than 1% over the course of the simulation, the time step should be 0.3.

This was determined through a trial and error approach with the above value corresponding to a variation of 1%.

Ideally, the largest possible time step would be used in Molecular Dynamics Studies in order to stimulate for the longest possible time, though the smaller the time step, the smaller the error due to fluctuations. The balance is struck on the basis that small time steps increase the time taken for the calculations to be run (in more complex cases than this task of course).

It is of great importance to monitor the total energy of the physical system when modelling the behaviour numerically because we aim to be at equilibrium: significant fluctuations in energy would demonstrate this wasn't the case.

JC: Energy should be conserved so the total energy should be as constant as possible.

TASK 4: For a single Lennard-Jones interaction, $ϕ (r) = 4 ϵ (\frac{σ^{12}}{r^{12}} - \frac{σ^{6}}{r^{6}})$ , find the separation, $r_{0}$ , at which the potential energy is zero. What is the force at this separation? Find the equilibrium separation, $r_{e q}$ , and work out the well depth ( $ϕ (r_{e q})$ ). Evaluate the integrals $\int_{2 σ}^{\infty} ϕ (r) d r$ , $\int_{2.5 σ}^{\infty} ϕ (r) d r$ , and $\int_{3 σ}^{\infty} ϕ (r) d r$ when $σ = ϵ = 1.0$ .

JC: The energy at r0 is zero, but the force is the derivative or gradient of the potential energy and is not zero at r0. All other maths correct.

TASK 5: Estimate the number of water molecules in 1ml of water under standard conditions. Estimate the volume of $10000$ water molecules under standard conditions.

TASK 6: Consider an atom at position $(0.5, 0.5, 0.5)$ in a cubic simulation box which runs from $(0, 0, 0)$ to $(1, 1, 1)$ . In a single timestep, it moves along the vector $(0.7, 0.6, 0.2)$ . At what point does it end up, after the periodic boundary conditions have been applied?.

The atom would end up at (0.2, 0.1, 0.7) after applying the periodic boundary conditions.

TASK 7: The Lennard-Jones parameters for argon are $σ = 0.34 n m, ϵ / k_{B} = 120 K$ . If the LJ cutoff is $r^{*} = 3.2$ , what is it in real units? What is the well depth in ${k J m o l}^{- 1}$ ? What is the reduced temperature $T^{*} = 1.5$ in real units?

JC: Correct.

Equilibration

TASK 8: Why do you think giving atoms random starting coordinates causes problems in simulations? Hint: what happens if two atoms happen to be generated close together?

Giving atoms random starting coordinates causes problems in simulations because there is the possibility that bonding interactions could occur, should two atoms happen to be generated close together.

This can be prevented by placing the atoms on lattice points of a simple cubic lattice.

JC: The Lennard Jones potential only accounts for intermolecular forces and so atoms cannot form bonds. If atoms are placed very close together the high repulsive forces can make the simulation crash.

TASK 9: Satisfy yourself that this lattice spacing corresponds to a number density of lattice points of $0.8$ . Consider instead a face-centred cubic lattice with a lattice point number density of 1.2. What is the side length of the cubic unit cell?

Number Density^[1] is defined as the number of specified objects per unit volume, and as such for a density of 0.8 and 2 atoms per body centred cubic lattice, this gives a volume of 2.5 reduced units per unit cell. This corresponds to a lattice spacing of 1.07722 as related by its Atomic packing factor^[2] .

While for a face-centred cubic lattice with lattice point number density of 1.2 the side length of the unit cell is 1.063, obtained by taking the cube root of the density.

TASK 10: Consider again the face-centred cubic lattice from the previous task. How many atoms would be created by the create_atoms command if you had defined that lattice instead?

The face centred cubic lattice has 4 lattice points (here, atoms) per unit cell, and therefore creating 1000 fcc unit cells generates 4000 lattice points (atoms).

TASK 11: Using the LAMMPS manual, find the purpose of the of the following commands in the input script:

mass 1 1.0
pair_style lj/cut 3.0
pair_coeff * * 1.0 1.0

Line 1: mass I value

I defines the atom type while value allows us to input the mass

Line 2: pair_style style args

with style as lj/cut, this indicates the standard 12/6 Lennard Jones potential

while 3.0 shows the list of arguments, here a 3.0 (reduced units) cut-off distance

Line 3: pair_coeff I J args specifies the the pairwise standard forcefiled coefficients for one or more pairs of atom types, with * indicating 1 to N

JC: What are the forcefield coefficients for the Lennard Jones forcefield? Why do we use a cutoff?

TASK 12: Given that we are specifying $x_{i} (0)$ and $v_{i} (0)$ , which integration algorithm are we going to use?

We will be using the Velocity-Verlet Integration algorithm beacuse we have been given the initial position and velocity.

TASK 13: Look at the lines below.

### SPECIFY TIMESTEP ###
variable timestep equal 0.001
variable n_steps equal floor(100/${timestep})
variable n_steps equal floor(100/0.001)
timestep ${timestep}
timestep 0.001

### RUN SIMULATION ###
run ${n_steps}
run 100000

The second line (starting "variable timestep...") tells LAMMPS that if it encounters the text ${timestep} on a subsequent line, it should replace it by the value given. In this case, the value ${timestep} is always replaced by 0.001. In light of this, what do you think the purpose of these lines is? Why not just write:

timestep 0.001
run 100000

It is much better practice, when writing code, to define a value as a variable (e.g. 1000 as the timestep). This is because this then allows for the quicker and easier modification of this value, and without it there could be the risk of errors forgetting to change one of the values further down in the code.

TASK 14: make plots of the energy, temperature, and pressure, against time for the 0.001 timestep experiment (attach a picture to your report). Does the simulation reach equilibrium? How long does this take? When you have done this, make a single plot which shows the energy versus time for all of the timesteps (again, attach a picture to your report). Choosing a timestep is a balancing act: the shorter the timestep, the more accurately the results of your simulation will reflect the physical reality; short timesteps, however, mean that the same number of simulation steps cover a shorter amount of actual time, and this is very unhelpful if the process you want to study requires observation over a long time. Of the five timesteps that you used, which is the largest to give acceptable results? Which one of the five is a particularly bad choice? Why?

These graphs above show the energy, temperature and pressure against time for the 0.001 timestep experiment. The simulation does reach equilibrium in under 1 reduced time units , as shown by the values fluctuating around a mean value for each graph after this. Below, the graph to show energy against time at all steps is shown.

From this, it can be seen that the largest timestep to give acceptable results is 0.01 because although the mean around which the fluctuation occurs is slightly higher than the other, smaller, timesteps, it is still linear, and does not increase over time, i.e. it has equilibriated. This is what makes the 0.015 timestep a particularly bad choice: the energy does not stay constant over time.

JC: The average total energy should not depend on the choice of timestep as it does for 0.0075 and 0.01. Both 0.0025 and 0.001 give the same average total energy, so the best choice is the larger of these two values - 0.0025.

Running simulations under specific conditions

TASK 15: Choose 5 temperatures (above the critical temperature $T^{*} = 1.5$ ), and two pressures (you can get a good idea of what a reasonable pressure is in Lennard-Jones units by looking at the average pressure of your simulations from the last section). This gives ten phase points — five temperatures at each pressure. Create 10 copies of npt.in, and modify each to run a simulation at one of your chosen $(p, T)$ points. You should be able to use the results of the previous section to choose a timestep. Submit these ten jobs to the HPC portal. While you wait for them to finish, you should read the next section.

TASK 16: We need to choose $γ$ so that the temperature is correct $T = 𝔗$ if we multiply every velocity $γ$ . We can write two equations:

$E_{K} = \frac{3}{2} N k_{B} T$

$\frac{1}{2} \sum_{i} m_{i} v_{i}^{2} = \frac{3}{2} N k_{B} T$

Solve these to determine $γ$ .

The solution for the task is provided below, making use of the fact that gamma is a constant.

JC: Correct.

TASK 17: Use the manual page to find out the importance of the three numbers 100 1000 100000. How often will values of the temperature, etc., be sampled for the average? How many measurements contribute to the average? Looking to the following line, how much time will you simulate?

100 = N every = use input values every 100 timesteps

1000 = N repeat = the number of times to use input values for calculating averages

100000 = N frequency = calculate the average this frequently

So running 100,000 atoms, taking input values every 100 timesteps, 1000 will be simulated.

JC: What does this mean?

TASK 18: When your simulations have finished, download the log files as before. At the end of the log file, LAMMPS will output the values and errors for the pressure, temperature, and density $(\frac{N}{V})$ . Use software of your choice to plot the density as a function of temperature for both of the pressures that you simulated. Your graph(s) should include error bars in both the x and y directions. You should also include a line corresponding to the density predicted by the ideal gas law at that pressure. Is your simulated density lower or higher? Justify this. Does the discrepancy increase or decrease with pressure?

The simulated density is at lower height than that shown by the Ideal Gas Equation. This is because the Ideal Gas Equation does not take into account any intermolecular forces. For the Lennard-Jones atoms, there is repulsion between them, so they are packed less tightly than that of the Ideal Gas.

At higher pressures, the atoms are forced to be even closer: in the LJ simulation, there isn't as much change in density as for an ideal gas, because the repulsions present between atoms prevent the increase to a great extent. In contrast, the atoms in the ideal gas can rearrange themselves to be more tightly packed, creating the increase in discrepancy.

JC: Correct, how does the discrepancy between simulation and ideal gas results change with temperature and pressure?

Calculating heat capacities using statistical physics

TASK 19: As in the last section, you need to run simulations at ten phase points. In this section, we will be in density-temperature $(ρ^{*}, T^{*})$ phase space, rather than pressure-temperature phase space. The two densities required at $0.2$ and $0.8$ , and the temperature range is $2.0, 2.2, 2.4, 2.6, 2.8$ . Plot $C_{V} / V$ as a function of temperature, where $V$ is the volume of the simulation cell, for both of your densities (on the same graph). Is the trend the one you would expect? Attach an example of one of your input scripts to your report.

The link to the log script is here

The trend can be seen to be that the higher the density, the greater the heat capacity at each of the temperatures.

Increasing the density increases the number of particles per unit volume, and so this greater number of particles needs more energy to be thermally excited (i.e. a higher heat capacity).

JC: Good, did you have any ideas about the trend in heat capacity with temperature?

Structural properties and the radial distribution function

TASK 20: perform simulations of the Lennard-Jones system in the three phases. When each is complete, download the trajectory and calculate $g (r)$ and $\int g (r) d r$ . Plot the RDFs for the three systems on the same axes, and attach a copy to your report. Discuss qualitatively the differences between the three RDFs, and what this tells you about the structure of the system in each phase. In the solid case, illustrate which lattice sites the first three peaks correspond to. What is the lattice spacing? What is the coordination number for each of the first three peaks?

g(r) is the radial distribution function, and in the graph above this is measured for a gas, liquid and solid simulation. It shows the structure of a system, being a measure of the probability of finding a particle at a distance of r from a reference particle (relative to an ideal gas particle). Qualitatively, it can be seen that the solid has the greatest number of neighbours: it's separation and peak heights are characteristic of its ordered lattice structure, the fcc lattice. This is expected because it is the most densely packed, and has the most collisions with other atoms. Liquid, as expected, has a more greatly fluctuating RDF than gas due to its denser structure, but less so than that of a solid.

The integral of g(r) shows the degree of order in each of the phases: with the values confirming: solid, liquid, gas in decreasing atomic order.

Each peak corresponds to a coordination shell. The first three peaks in the fcc solid case correspond to

1. the atom in the face centre from the reference atom placed arbitrarily in the corner of the unit cell

2. the two adjacent corners (longer, so at greater separation value)

3. the opposite face centred atom (having the greatest separation).

The coordination number of the lattice is the number of nearest neighbours. For fcc, this is 12^[3], and the simulation shows this also, by numerical integration under the first peak (the first coordination shell).

JC: The solid g(r) has peaks at large r indicating long range order, the liquid has peaks at short distance indicating some short range order, but no long range order. How did you calculate the number of atoms responsible for each peak, try zooming in to the integral of g(r) to see what the integral of the first 3 peaks is. Did you calculate the lattice parameter?

Dynamical properties and the diffusion coefficient

TASK 21: In the D subfolder, there is a file liq.in that will run a simulation at specified density and temperature to calculate the mean squared displacement and velocity autocorrelation function of your system. Run one of these simulations for a vapour, liquid, and solid. You have also been given some simulated data from much larger systems (approximately one million atoms). You will need these files later.

In this section, simulations have been run to calculate the mean squared displacement and velocity auto-correlation function of the system at vapour, liquid and solid.

TASK 22: make a plot for each of your simulations (solid, liquid, and gas), showing the mean squared displacement (the "total" MSD) as a function of timestep. Are these as you would expect? Estimate $D$ in each case. Be careful with the units! Repeat this procedure for the MSD data that you were given from the one million atom simulations.

Calculations Carried Out Using HPC Portal, fcc 0.8 density	Observations	Estimate of D (in reduced units)	Calculations carried out from 1 million atom simulations (provided)	Estimate of D (in reduced units) for 1 million atoms
Gas Mean Squared Displacement Plot	It can be seen that we need a longer time to capture the long-time diffusive behaviour of a a gas as its mean free path is longer than of the liquid. Here it is initially under a ballistic regime,with no collisions taking place, due to the gas atoms being far from each other.	1.44E-2	Gas Mean Squared Displacement Plot for 1 million atoms	6.03E-3
Liquid Mean Squared Displacement Plot	MSD is shown to grow linearly with time.	1.67E-4	Liquid Mean Squared Displacement Plot for 1 million atoms	1.67E-4
Solid Mean Squared Displacement Plot	The mean free path for a solid is very short: the diffusion is expected to be very small.	6.66E-9	Solid Mean Squared Displacement Plot for 1 million atoms	1.67E-8

JC: Correct.

TASK 23: In the theoretical section at the beginning, the equation for the evolution of the position of a 1D harmonic oscillator as a function of time was given. Using this, evaluate the normalised velocity autocorrelation function for a 1D harmonic oscillator (it is analytic!):

$C (τ) = \frac{\int_{- \infty}^{\infty} v (t) v (t + τ) d t}{\int_{- \infty}^{\infty} v^{2} (t) d t}$

Be sure to show your working in your writeup. On the same graph, with x range 0 to 500, plot $C (τ)$ with $ω = 1 / 2 π$ and the VACFs from your liquid and solid simulations. What do the minima in the VACFs for the liquid and solid system represent? Discuss the origin of the differences between the liquid and solid VACFs. The harmonic oscillator VACF is very different to the Lennard Jones solid and liquid. Why is this? Attach a copy of your plot to your writeup.'

JC: The integration can be avoided by recognising the numerator as a mixture of odd and even functions and canceling the integrals accordingly.

The plot on the left shows the VACF for Liquid, Solid and Harmonic Oscillator. As the velocity auto-correlation function (VACF) is averaged over time and atoms, the plot for solid and liquid can be seen to be highest at τ = 0 because at this time velocity is v(t)² which has greater value than <v(t) * v(t + τ)> where the velocity at the later time (t + τ) would be different to that at time t and thus leads to a decrease in C(τ).

The minima in the VACF results from the collisions between atoms occuring in the solid and liquid, modelling the oscillation the molecule undergoes in its lattice for example (for the solid).

The liquid and solid VACFs have differences due to the liquid being a less ordered collection of atoms than a solid.

The HO VASF is very different because it doesn't take into account the intermolecular interactions that the Lennard-Jones liquid and solid have, which has the effect of randomising their velocities (and thus the mean is fluctuating around, and eventually approaches zero - the positive and negative velocities have the net effect of zero). Therefore the HO has a constantly oscillating VASF.

TASK 24: Use the trapezium rule to approximate the integral under the velocity autocorrelation function for the solid, liquid, and gas, and use these values to estimate $D$ in each case. You should make a plot of the running integral in each case. Are they as you expect? Repeat this procedure for the VACF data that you were given from the one million atom simulations. What do you think is the largest source of error in your estimates of D from the VACF?

The Trapezoid Rule is used for the approximation of the integrals over the timesteps used in the simulation.

The integral under the VACF gives the diffusion constant D because the displacement of a particle over an interval t can be obtained by integrating its velocity.

The following plots show the VACF for solid, liquid and gas both for the simulations run and for the 1 million atom simulations, as well as the running integral plots for 1 million atoms.

VACF	VACF for 1 million atoms	Plot of running Integral for 1 million atoms (as approximated by the Trapezium Rule)
Liquid
Solid
Gas

JC: Why does the integral of the solid VACF increase rather than flattening out, and why does it give such a large diffusion coefficient? The plots of the VACF look good, so maybe there is a mistake in the calculation of the running integral?

The diffusion coefficient, D is given by

and the values are tabulated below

			approximate D value as obtained by the Trapezium Rule
Gas			1.52E03
Liquid			4.50E01
Solid			4.00E07

With the great difference in D for the solid due to the nature of the approximation: there is great over-estimation of the area at each step through the use of the trapezium rule.

Discussion of the VACF graphs:

The plots to show the velocity auto-correlation functions show how it varies over time: the gas is observed to vary differently from the liquid or solid due to its much greater molecular disorder: it does not have the minima like the solid and liquid since collisions don't occur as frequently.

References:

↑ https://en.wikipedia.org/wiki/Number_density
↑ https://en.wikipedia.org/wiki/Atomic_packing_factor
↑ https://en.wikipedia.org/wiki/Coordination_number

[1] ttps://en.wikipedia.org/wiki/Number_density

[2] ttps://en.wikipedia.org/wiki/Atomic_packing_factor

[3] ttps://en.wikipedia.org/wiki/Coordination_number

[1]

[2]

[3]