Talk:Mod:ek1813
3rd Year: Computational Lab-Liquid Simulations
This lab introduced us to Molecular Dynamics simulations using LAMMPS computational software. Various ensembles and timesteps were used in order to study the effects of system equilibration, variation of density with temperature, heat capacities, the radial distribution function and how it relates to the structural properties of the system and finally diffusion. Additionally, the visualization software VMD was used to visualize some of the data that was produced. 06/11/2015 Elisabeth Koninx
Introduction to Molecular Dynamics Simulation
TASK: Open the file HO.xls. In it, the velocity-Verlet algorithm is used to model the behaviour of a classical harmonic oscillator. Complete the three columns "ANALYTICAL", "ERROR", and "ENERGY": "ANALYTICAL" should contain the value of the classical solution for the position at time , "ERROR" should contain theabsolute difference between "ANALYTICAL" and the velocity-Verlet solution (i.e. ERROR should always be positive -- make sure you leave the half step rows blank!), and "ENERGY" should contain the total energy of the oscillator for the velocity-Verlet solution. Remember that the position of a classical harmonic oscillator is given by
The following values for the constants were given in the data sheet:
In order to solve for the analytical solution of , the classical harmonic oscillator equation was used from above. For example, at time 0.00 the analytical solution is:
To calculate the error the absolute difference was calculated between the solutions to the classical harmonic oscillator and the velocity-Verlet solution. The largest error was on the order of magnitude of which indicates that the classical harmonic oscillator is a good approximation of the velocity-Verlet equation.
Finally the total energy is the sum of the kinetic and potential energy. This was given by where and were solutions of the velocity-Verlet equation.
The following graphs were produced:
TASK: For the default timestep value, 0.1, estimate the positions of the maxima in the ERROR column as a function of time. Make a plot showing these values as a function of time, and fit an appropriate function to the data.
From Figure 2, the maximum error values were found. This was then plotted and a line of best fit was fitted to the data. The equation for the line is and has a value of which indicates that the line is an excellent fit to the data. We can therefore extrapolate that as the simulation continues at that timestep, the error between the analytical harmonic oscillator and the Velocity-Verlet equation will increase linearly. This graph can be seen below:
TASK: Experiment with different values of the timestep. What sort of a timestep do you need to use to ensure that the total energy does not change by more than 1% over the course of your "simulation"? Why do you think it is important to monitor the total energy of a physical system when modelling its behaviour numerically?
If we restrict the total energy such that it does not change more than 1% of then the energy can only oscillate between . Increasing the time step to 0.2 ensures that this constraint is met as can be seen in the figure below.
NJ: Be careful - . You mean
It is important to monitor the total energy of a system to ensure that it stays constant as that is one of the constraints that you have placed on the system. If the simulation breaks this constraint then the simulation loses its meaning.
NJ: AN=nd physically, we know that energy has to be conserved.
TASK: For a single Lennard-Jones interaction, , find the separation, , at which the potential energy is zero. What is the force at this separation? Find the equilibrium separation, , and work out the well depth (). Evaluate the integrals , , and when .
On a Lennard Jones potential graph is the value of where the potential is equal to zero. Therefore
Solving this equation gives:
In order to find the force at this value of
we need to use the classical physics equation of:
As there is only a single Lennard Jones interaction, this means that the value of in the above equation is equal to 1, so the equation reduces to
To find the force at the separation we substitute into the above equation.
The equilibrium separation, is the value of at the minimum in the Lennard Jones potential. At this point the force is equal to zero. Hence:
In order to find the well depth the value needs to be used in the Lennard Jones interaction potential equation.
Finally the integrals can be evaluated knowing that
The other integrals follow the same set up:
NJ: The final two are a bit too small, double check your working.
TASK: Estimate the number of water molecules in 1ml of water under standard conditions. Estimate the volume of water molecules under standard conditions.
to answer this question I calculated the number of moles in 1 mL of water and then multiplied by Avogadros Number:
To estimate the volume of 1000 water molecules we work through the same process but backwards:
TASK: Consider an atom at position in a cubic simulation box which runs from to . In a single timestep, it moves along the vector . At what point does it end up, after the periodic boundary conditions have been applied?.
Applying periodic boundary conditions simply means that when the particle moves outside the simulation box to its final position, it is actually displaced back in its final position in the original simulation box. Therefore this particle will end up in the position (0.2, 0.2, 0.7)
NJ: (0.2, 0.1, 0.7)
TASK: The Lennard-Jones parameters for argon are . If the LJ cutoff is , what is it in real units? What is the well depth in ? What is the reduced temperature in real units?
We can find the answer to these questions through simple rearrangements of the following equations:
- distance
- energy
- temperature
NJ: I think you've missed a 1000 somewhere, it should be about 0.998 kJmol^-1
Equilibrations
TASK: Why do you think giving atoms random starting coordinates causes problems in simulations?
If atoms were given random starting coordinates then there is a risk that the atoms may be placed too close together or placed directly on top of each other. This would cause problems within the simulation as it would distort the interaction potentials and the simulation software would unlikely be able to apply the parameters which would cause errors to occur.
TASK: Satisfy yourself that this lattice spacing corresponds to a number density of lattice points of . Consider instead a face-centred cubic lattice with a lattice point number density of 1.2. What is the side length of the cubic unit cell?
The number density, number of lattice points and distance between lattice points are all related through the following equation:
Therefore for the simple cubic lattice that has one lattice point within its unit cell and a distance of 1.07722 between lattice points the number density can be calculated as:
The face centred cubic lattice has four lattice points, and if it has a number density of then the distance between lattice points can be calculated:
TASK: Consider again the face-centred cubic lattice from the previous task. How many atoms would be created by the create_atoms command if you had defined that lattice instead?
The face centred cubic lattice contains four lattice points in a single unit cell. Therefore, 4000 atoms would be created in our simulation box.
TASK: Using the LAMMPS manual, find the purpose of the following commands in the input script:
mass 1 1.0 pair_style lj/cut 3.0 pair_coeff * * 1.0 1.0
mass 1 1.0
This part of the code tells the LAMMPS simulation software to assign all atoms of type 1 to a mass of 1.0.
pair_style lj/cut 3.0
This part of the code tells the LAMMPS simulation software to use a Lennard-Jones potential with no Coulombic interactions and the LJ cutoff is 3.0.
pair_coeff * * 1.0 1.0
This part of the code tells the LAMMPS simulation software to set the coefficients for the symmetric J, I interactions for all the atoms 1 through N to have the same value of 1.0.
NJ: What are the coefficients in this case?
TASK: Given that we are specifying and , which integration algorithm are we going to use?
The integration algorithm which will be used is the velocity-Verlet algorithm as and are specified at the beginning of the simulation.
TASK: Look at the lines below.
### SPECIFY TIMESTEP ### variable timestep equal 0.001 variable n_steps equal floor(100/${timestep}) variable n_steps equal floor(100/0.001) timestep ${timestep} timestep 0.001 ### RUN SIMULATION ### run ${n_steps} run 100000
The second line (starting "variable timestep...") tells LAMMPS that if it encounters the text ${timestep} on a subsequent line, it should replace it by the value given. In this case, the value ${timestep} is always replaced by 0.001. In light of this, what do you think the purpose of these lines is? Why not just write:
timestep 0.001 run 100000
The reason that the code is written like the former is because this means that the code can be re-used easily. for example, if you would want to run the simulation again but at a different time step, you only need to re-write one part of the code rather than having to search through the whole input script to manually change the time step each time you have specifically noted it down somewhere.
NJ: This section makes sure that you always simulate the same time (nsteps*dt), no matter what timestep is chosen.
TASK: make plots of the energy, temperature, and pressure, against time for the 0.001 timestep experiment (attach a picture to your report). Does the simulation reach equilibrium? How long does this take? When you have done this, make a single plot which shows the energy versus time for all of the timesteps (again, attach a picture to your report). Choosing a timestep is a balancing act: the shorter the timestep, the more accurately the results of your simulation will reflect the physical reality; short timesteps, however, mean that the same number of simulation steps cover a shorter amount of actual time, and this is very unhelpful if the process you want to study requires observation over a long time. Of the five timesteps that you used, which is the largest to give acceptable results? Which one of the five is a particularly bad choice? Why?
A plot of the energy, temperature and pressure vs time for the timestep of 0.001 can be seen below:
From Figure 6 we can see that the simulation does reach equilibrium as the average values of the temperature, pressure and energy are constant even though there are instantaneous fluctuations. The system reaches equilibrium very quickly, as can be seen from figure 7 below, the system has already reached equilibrium after a time of 10 ns.
NJ: Not ns, reduced units!
The next figure shows the energy vs time for all the timesteps:
NJ: This graph might be slightly clearer with (thin) lines rather than points.
The timestep 0.001 is the smallest timestep and so gives the most accurate results of the five. The next largest timestep is 0.0025, which as can be seen from the graph gives nearly the same results as the timestep 0.001. Once the timestep becomes even bigger to 0.0075, there is a deviation between the results obtained from the timesteps 0.001 and 0.0025. Therefore the largest timestep that gives acceptable results is 0.0025. The timestep that is a particularly bad choice is the timestep of 0.015. With this timestep the system does not even equilibriate.
NJ: Energy is not conserved
Running Simulations under Specific Conditions
TASK: Choose 5 temperatures (above the critical temperature ), and two pressures (you can get a good idea of what a reasonable pressure is in Lennard-Jones units by looking at the average pressure of your simulations from the last section). This gives ten phase points — five temperatures at each pressure. Create 10 copies of npt.in, and modify each to run a simulation at one of your chosen points. You should be able to use the results of the previous section to choose a timestep. Submit these ten jobs to the HPC portal. While you wait for them to finish, you should read the next section.
pressure 1 = 2.4 | pressure 2 = 2.7 |
temperature 1 = 1.5 | temperature 1 = 1.5 |
temperature 2 = 1.7 | temperature 2 = 1.7 |
temperature 3 = 1.9 | temperature 3 = 1.9 |
temperature 4 = 2.1 | temperature 4 = 2.1 |
temperature 5 = 2.3 | temperature 5 = 2.3 |
TASK: We need to choose so that the temperature is correct if we multiply every velocity . We can write two equations:
Solve these to determine .
To solve for requires us to rearrange the above equations.
Rearranging the first equation:
If we rearrange the second equation:
We can then bring out from the sum as it is a constant:
We then rearrange and solve for :
Finally we substitute for
TASK: Use the manual page to find out the importance of the three numbers 100 1000 100000. How often will values of the temperature, etc., be sampled for the average? How many measurements contribute to the average? Looking to the following line, how much time will you simulate?
The three numbers in the line of the script 'fix aves all ave/time 100 1000 100000 v_dens v_temp v_press v_dens2 v_temp2 v_press2' are the values for , and respectively. The values given for these variables specify what time steps are used to calculate the average. Therefore at each timestep which is a multiple of , the average will be computed from the last timesteps that have a gap of between them. In this specific case the temperature will be sampled every time step that is a multiple of 100000 for the average and 1000 measurements will contribute the the average. In this example, it will simulate a total time of 100000.
TASK: When your simulations have finished, download the log files as before. At the end of the log file, LAMMPS will output the values and errors for the pressure, temperature, and density . Use software of your choice to plot the density as a function of temperature for both of the pressures that you simulated. Your graph(s) should include error bars in both the x and y directions. You should also include a line corresponding to the density predicted by the ideal gas law at that pressure. Is your simulated density lower or higher? Justify this. Does the discrepancy increase or decrease with pressure?
Using excel I plotted density as a function of temperature and these figures can be seen below:
As can be seen from figures 9-11, the simulated density is lower than the densities predicted by the Ideal Gas Law. According to the Ideal Gas Law, the density is inversely proportional to the temperature at constant pressure.
The simulated density is lower than the density predicted by the Ideal Gas Law because of the two assumptions that the Ideal Gas Law is based on. These are assumptions are 1) that the gas particles are much smaller compared to the distances separating them from other gas particles and 2) that there are no attractive forces between the molecules. In the LAMMPS simulation we defined interaction potentials which already means that the simulation will not follow the Ideal Gas Law trend.
The discrepancy between the LAMMPS simulation and the Ideal Gas Law increases with pressure, as can be seen from figure 11. This is because at constant temperature, according to the ideal gas law, the density is directly proportional to the pressure. Therefore if the pressure increases, the density will also increase which causes the discrepancy to increase.#
NJ: What is it about these interactions that makes the system non-ideal? What about the behaviour of the discrepancy with temperature?
Calculating Heat Capacities
TASK: As in the last section, you need to run simulations at ten phase points. In this section, we will be in density-temperature phase space, rather than pressure-temperature phase space. The two densities required at and , and the temperature range is . Plot as a function of temperature, where is the volume of the simulation cell, for both of your densities (on the same graph). Is the trend the one you would expect? Attach an example of one of your input scripts to your report.
Below is the plotted heat capacity graph as a function of temperature:
As can be seen from figure 12 there are two trends which can be seen. The first trend is that at higher densities (0.8 vs 0.2), the heat capacity (at a certain temperature) is larger. This is because heat capacity is an extensive property and so if the density increases at constant volume then there will be more molecules in that volume which are able to adsorb the heat and so the heat capacity increases.
The second trend is that as the temperature increases the heat capacity decreases. The reason for this is less well understood however one school of thinking is that this is due to the loss of two transverse mode in the liquid with frequency where is the liquid relaxation time (DOI: http://dx.doi.org/10.1103/PhysRevB.78.104201). Another way of thinking is that as you increase the temperature, the energy levels in the system become closer together and so the difference in energy between the levels becomes smaller. As a result the particles adsorb less energy to move between levels and so the heat capacity decreases.
NJ: Good suggestion!
Example of input script:
### DEFINE SIMULATION BOX GEOMETRY ### lattice sc 0.2 #can change the density here between 0.2 and 0.8 as specified in the lab script region box block 0 15 0 15 0 15 create_box 1 box create_atoms 1 box ### DEFINE PHYSICAL PROPERTIES OF ATOMS ### mass 1 1.0 pair_style lj/cut/opt 3.0 pair_coeff 1 1 1.0 1.0 neighbor 2.0 bin ### SPECIFY THE REQUIRED THERMODYNAMIC STATE ### variable T equal 2.0 variable timestep equal 0.001 ### ASSIGN ATOMIC VELOCITIES ### velocity all create ${T} 12345 dist gaussian rot yes mom yes ### SPECIFY ENSEMBLE ### timestep ${timestep} fix nve all nve ### THERMODYNAMIC OUTPUT CONTROL ### thermo_style custom time etotal temp press thermo 10 ### RECORD TRAJECTORY ### dump traj all custom 1000 output-1 id x y z ### SPECIFY TIMESTEP ### ### RUN SIMULATION TO MELT CRYSTAL ### run 10000 unfix nve reset_timestep 0 ### BRING SYSTEM TO REQUIRED STATE ### variable tdamp equal ${timestep}*100 variable pdamp equal ${timestep}*1000 fix nvt all nvt temp ${T} ${T} ${tdamp} unfix nvt fix nve all nve run 100000 reset_timestep 0 ### MEASURE SYSTEM STATE ### thermo_style custom step etotal temp atoms vol variable temp2 equal temp*temp variable energy equal etotal variable energy2 equal etotal*etotal variable volume equal vol variable atoms2 equal atoms*atoms fix aves all ave/time 100 1000 100000 v_energy v_energy2 run 100000 variable aveenergy equal f_aves[1] variable aveenergy2 equal f_aves[2] variable heatcap equal ${atoms2}*(${aveenergy2}-(${aveenergy}*${aveenergy}))/${temp2} variable CvV equal ${atoms2}*(${aveenergy2}-(${aveenergy}*${aveenergy}))/(${temp2}*${volume}) print "Averages" print "--------" #print "Density: ${avedens}" #print "Stderr: ${errdens}" print "Temperature2: ${temp2}" #print "Stderr: ${errtemp}" #print "Pressure: ${avepress}" #print "Stderr: ${errpress}" print "AveEnergy: ${aveenergy}" print "AveEnergy2: ${aveenergy2}" print "Volume: ${volume}" print "Atoms2: ${atoms2}" print "Heat Capacity: ${heatcap}" print "Cv/V: ${CvV}"
NJ: You haven't used the temperature that was averaged over the whole simulation, you've used the instantaneous temperature from the final step. This will distort the results a little.
Structural Properties and the Radial Distribution function
TASK: perform simulations of the Lennard-Jones system in the three phases. When each is complete, download the trajectory and calculate and . Plot the RDFs for the three systems on the same axes, and attach a copy to your report. Discuss qualitatively the differences between the three RDFs, and what this tells you about the structure of the system in each phase. In the solid case, illustrate which lattice sites the first three peaks correspond to. What is the lattice spacing? What is the coordination number for each of the first three peaks?
Below are the plotted RDFs for the gas, liquid and solid phase.
The RDF describes how the density or the probability of finding a particle varies as a function of distance from a reference particle. As can be seen from figures 13 to 16 there are a few differences between the RDFs of the solid, liquid and gas. In the RDF graph for the gas there is one maxima which then immediately after decreases to 1. This means that as you move away from the reference particle in a gas, there is a high probability of finding another particle in the immediate vicinity, but as you move further out, the probability of finding another particle diminishes. In the liquid RDF graph, we see that a liquid has three well defined peaks which corresponds to the idea that the liquid has three reasonably well defined neighbors. Finally the solid RDF graph has many well defined peaks which illustrates that the solid has many well defined neighbours. The RDF graphs illustrate the degree of order within the system.
NJ: We generally classify the phases as follows: gas, no order; liquid, short range but no long range order; solid, short and long range order.
The peaks correspond to the nearest neighbours. Therefore the first peak in the solid corresponds to the first nearest neeighbours, the second peak corresponds to the second nearest neighbours and the third peak corresponds to the third nearest neighbours. The unit cell of a FCC lattice is shown below in figure 17. Using trigonometry it is possible to figure out that the distance to the closest neighbours is . The distance to the second closest neighbours is and the distance to the third closest neighbours is . Where is the lattice spacing. Therefore, the second peak corresponds to a distance of so
NJ: You could have used all three peak values to get three measurements of a, and then taken an average. At the moment your a is a little large.
Finally the average coordination numbers can be obtained by integrating the RDF to the corresponding minimum of the first three peaks. I have plotted the running integral of the solid RDF and by comparing it to the first, second and third minima x-coordinate values on figure 16 and comparing that coordinate value to extract the y-coordinate value on figure 18 gives the coordination number. So the coordination number of the first peak is 12, the coordination of the second peak is (18-12) = 6 and the coordination of the third peak is (42-18-12) = 12. NJ: Good, but the final one should be 42-18 = 24.
NJ: Lovely plot
Dynamical Properties and the Diffusion Coefficient
TASK: In the D subfolder, there is a file liq.in that will run a simulation at specified density and temperature to calculate the mean squared displacement and velocity autocorrelation function of your system. Run one of these simulations for a vapour, liquid, and solid. You have also been given some simulated data from much larger systems (approximately one million atoms). You will need these files later.
The density and temperature values were modified in order to simulate a solid, liquid and a gas. For the solid , for the liquid and for the gas
TASK: make a plot for each of your simulations (solid, liquid, and gas), showing the mean squared displacement (the "total" MSD) as a function of timestep. Are these as you would expect? Estimate in each case. Be careful with the units! Repeat this procedure for the MSD data that you were given from the one million atom simulations.

As can be seen from the figures above, as expected, the gas has the greatest amount of diffusion, followed by the liquid and finally the solid barely diffuses at all. Intuitively this makes sense because solids have a fixed position and are not able to move around very much. On the other side of the spectrum, gases have no fixed position and are free to move around without constraint.
In order to calculate from the graphs above, a line of best fit was drawn through the data points. For both the large and the small system in the gas and solid MSDs the line of best fit was only fitted to where the data was linear in order to get more accurate results. As the equation for is given by we can take the first derivative of the line of best fit, which is a constant and then multiply that by to get the value of .
The following results were obtained:
Phase | Small System | Large System |
---|---|---|
gas | 7.5485 | 3.1373 |
liquid | 0.0849 | 0.0873 |
solid | 0.0022 | 0.000005 |
As can be seen from the table, there is some discrepancy between the small and the large system for the values of between the gas and the solid. This may be due to several reasons. In the solid, if you zoom in on the part of the graph where the diffusion is "constant" we see that there is actually a lot of variation in the diffusion. This makes it difficult to draw a line of best fit and so there could be significant error that occurs here. As well as that, the discrepancies between the large and the small system may simply come down to the size of the system. The modelling of a large system will produce more accurate results as it is a better representation of a real life system.
NJ: I also didn't tell you what conditions I used for my big simulations, so they might be at different state points.
TASK: In the theoretical section at the beginning, the equation for the evolution of the position of a 1D harmonic oscillator as a function of time was given. Using this, evaluate the normalised velocity autocorrelation function for a 1D harmonic oscillator (it is analytic!):
Be sure to show your working in your writeup. On the same graph, with x range 0 to 500, plot with and the VACFs from your liquid and solid simulations. What do the minima in the VACFs for the liquid and solid system represent? Discuss the origin of the differences between the liquid and solid VACFs. The harmonic oscillator VACF is very different to the Lennard Jones solid and liquid. Why is this? Attach a copy of your plot to your writeup.
The first part of this task asks to evaluate the normalised velocity autocorrelation function for a 1D harmonic oscillator, which is given by the following function: . The first thing to do is to take the first derivative of this function, as the first derivative of position is velocity. So: This can be solved using the chain rule:
We can now evaluate
Using the trigonometric identity we can rewrite the term to give:
It is now possible to evaluate the integral:
We can take the constants outside of the integral and cancel the terms to leave:
We then expand out the numerator to:
Finally, we can split the fraction and take the constants outside the integral again:
Therefore as the integral in the first term reduces to and the numerator integral of the second term reduces to as the product of is an odd function and the integral of an odd function is if it evaluated by limits that a symmetric across the origin.
Figure 28 shows the plot of and the VACFs from the liquid and solid simulations. Plots of the VACFs can tell us useful information about the simulated system. At time 0, an atom will have a specific initial velocity, . If that atom experienced no interactions in the system, then according to Newtonian motion, that atom would retain its initial velocity forever. Therefore the plot of its VACF would be a straight horizontal line. However, in simulated systems where interaction parameters are set the VACF decays in such a way that reflect the decay of the correlation in atomic trajectories.
The minima in the VACFs for the liquid and solid system represent the maximum velocities achieved by the oscillations of the atoms within the system.
The difference between the liquid and solid VACFs is due to the nature of the order in the system as well as the interaction forces within the system. Solids are very ordered and have strong interaction forces and so the atomic trajectories will correlate quite effectively and oscillate strongly from positive to negative values. The function does still decay over time as there are still forces that act to to disrupt the oscillatory motion. Liquids on the other hand, have much more disorder than solids and do not have as strong interaction forces between the molecules in the system. So there is an initial amount of correlation in the particle trajectories however diffusive motion quickly interferes with any oscillatory motion.
NJ: The negative regions of the VACF correspond to collision with other atoms. The single minimum in the liquid VACF is due to collision with the "solvent cage".
The harmonic oscillator VACF is very different to the Lennard Jones solid and liquid because it assumes there is no forces that work against the oscillatory motion of the atoms in that perfect system.
NJ: There's nothing for the HO to collide with, so its velocity can never become decorrelated.
TASK: Use the trapezium rule to approximate the integral under the velocity autocorrelation function for the solid, liquid, and gas, and use these values to estimate in each case. You should make a plot of the running integral in each case. Are they as you expect? Repeat this procedure for the VACF data that you were given from the one million atom simulations. What do you think is the largest source of error in your estimates of D from the VACF?
To find using the VACF we need to use the equation . So we multiply the final value of the running integral by to obtain . The results are as follows:
Phase | Small System | Large System |
---|---|---|
gas | 8.843 | 3.268 |
liquid | 0.098 | 0.091 |
solid | 0.00034 | 0.00005 |
The plots of the running integrals are below:
The plots of the running integral are as expected.
The largest source of error in the estimates of come from the way of calculating the integral using the trapezoid rule, as the accuracy is limited to the size of the timestep.
NJ: How do these values compare to the MSD ones?