Prerequisites: high school level physics, undergraduate level math

One important idea in physics is that every state the universe can be in leads to exactly one other state, and every state is the result of exactly one other state. So if S is the set of all universe states, then there is some “do physics!” function f that is a bijection from S to S. And if you know that the universe is at some point in state x, then by repeatedly applying f or its inverse, like f(f(f…(x))), you can calculate every state the universe has passed through or ever will.
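To make this concrete, here is a minimal sketch of a made-up universe with just 5 states, where “do physics!” is an arbitrarily chosen permutation (a toy model, not real physics):

```python
# Toy model: a universe with only 5 possible states, labeled 0..4.
# "Do physics!" is a bijection f from states to states; here it is an
# arbitrarily chosen permutation.
f = {0: 2, 1: 0, 2: 3, 3: 4, 4: 1}

# Because f is a bijection, it has an inverse: physics run backwards.
f_inv = {after: before for before, after in f.items()}

def evolve(state, steps):
    """Apply f (steps > 0) or its inverse (steps < 0) repeatedly."""
    for _ in range(abs(steps)):
        state = f[state] if steps > 0 else f_inv[state]
    return state

# Knowing the state at one moment pins down the whole past and future:
now = 0
future = evolve(now, 3)    # 0 -> 2 -> 3 -> 4
past = evolve(future, -3)  # running backwards recovers the start
```

Because f is bijective, no two histories ever merge, which is exactly what makes the backwards run well-defined.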

  • Classically, this state is the position and velocity of every particle in the universe at a certain time. In special relativity, it’s the position and velocity of every particle along a spacelike slice of the universe. In quantum field theory, it’s the amplitude of every possible field configuration*. In general relativity I don’t know; I believe the only way we can solve it right now is by exploiting symmetry and making linear approximations.
  • *In other words, if X is the set of all positions in a spacelike slice, and Y is the set of values a tensor/spinor field can take, then a field configuration is a function from a position in X to an element of Y. If C is the set of all field configurations, then a state is a function from any configuration in C to a complex number (its amplitude). And S is the set of all states.

So if you have a single state and run physics, you get another state. But if you have a group of states and run physics on each of them, you get another group of the same size. Actually, this is not automatic when the states are continuous: for example, under g(x) = x/2 the image of (0, 1) is (0, 1/2), which has half the measure. However, it does hold for the processes of physics: classically this is Liouville’s theorem, and in quantum mechanics it follows from time evolution being unitary.
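A toy illustration of the contrast (the permutation is made up, and the interval length is just estimated from random samples):

```python
import random

# A bijection on a finite set of states never changes the count:
physics = {0: 2, 1: 0, 2: 3, 3: 4, 4: 1}   # made-up permutation
group = {0, 1, 3}
image = {physics[s] for s in group}          # still 3 states

# But a map on continuous states can shrink measure: g(x) = x/2 sends
# the interval (0, 1) to (0, 1/2), halving its length.
def g(x):
    return x / 2

samples = [random.uniform(0, 1) for _ in range(10_000)]
mapped = [g(x) for x in samples]
spread = max(mapped) - min(mapped)           # about 0.5, not about 1
```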

Usually, we do not know the exact state, only that it is one of many states inside a certain set. For example, if we have a bucket of water, we can’t measure every water molecule, but we can determine that the water is in a state with a certain temperature and pressure, in a volume shaped approximately like a bucket. Such a set is called a macrostate, and the number of states in this set determines the macrostate’s entropy (specifically, the entropy is proportional to the logarithm of that count).
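Here is a sketch with a hypothetical toy gas of 4 particles, where the only thing a measurement can see is how many particles sit in each half of a box:

```python
import math
from itertools import product

# Microstate: which half (Left/Right) each of 4 particles is in.
microstates = list(product("LR", repeat=4))   # 2**4 = 16 microstates

# Macrostate: all we can measure is the number of particles on the left.
def macrostate(micro):
    return micro.count("L")

# Entropy of a macrostate: log of how many microstates it contains.
def entropy(n_left):
    count = sum(1 for m in microstates if macrostate(m) == n_left)
    return math.log(count)

# "All 4 particles on the left" is a single microstate, while the even
# split contains 6 microstates, so the even split has higher entropy.
```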

  • Suppose you have a universe of room-temperature water with no outside interaction, and let’s say you were coming up with a new theory of physics about how this water will behave. To respect the bijectivity rule, you can’t say “after 5 seconds, all the water on the left will be cold and all the water on the right will be hot”, because there has to be an equal number of states before and after the 5 seconds pass. However, you could say “after 5 seconds, these few specific states of room-temperature water will separate into hot and cold, while the other states will do something else”.
  • Similarly, someone might propose a “Maxwell’s demon”: a clever contraption that interacts with a box of particles and appears to break the laws of thermodynamics. However, if you consider the whole system of both the contraption and the box, you know the number of states has to remain constant the whole time.
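A sketch of why counting the combined states matters, using a made-up demon: if the demon could sort the box while forgetting its old memory, many combined states would collapse into one, and that step is simply not a bijection:

```python
from itertools import product

# Combined system: (box state, demon memory state), 4 x 4 = 16 states.
combined = list(product(range(4), range(4)))

# Hypothetical "free sorting" demon: record the box in memory, reset
# the box to its sorted state 0.
def demon_step(box, memory):
    return (0, box)   # the old memory contents are simply thrown away

after = {demon_step(b, m) for b, m in combined}

# 16 combined states collapse to only 4, so this demon is not a
# bijection and cannot be real physics. A legal demon must keep all 16
# states around, e.g. by dumping its old memory into the environment,
# which is exactly where the "missing" entropy goes.
```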

Ok, so now we have reasonable justification that entropy can’t decrease. But why does it increase? From one perspective, it doesn’t: if you have 10 states and run physics, then after 1 second you will still have 10 states. However, since we are dealing with systems of so many particles interacting in complicated ways, states tend to evolve chaotically, meaning that states that were very similar in terms of qualities we can measure (such as position) will quickly diverge, and the 10 states we started off with will eventually seem uniformly distributed among all possible states. Since we cannot measure things with infinite precision, there will be many more states so close to these 10 starting states that we cannot tell them apart.
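This spreading can be sketched with a standard chaotic toy map, the doubling map x → 2x mod 1, which preserves measure but stretches nearby points apart (the starting cluster and the number of measurement bins are arbitrary choices):

```python
# Doubling map: measure-preserving but chaotic on [0, 1).
def step(x):
    return (2 * x) % 1.0

# 10 states packed into a tiny region: a low-entropy-looking cluster.
states = [0.500001 * (1 + i * 1e-7) for i in range(10)]

def occupied_bins(points, n_bins=8):
    """Coarse-grained view: which measurement bins contain a state?"""
    return {min(int(p * n_bins), n_bins - 1) for p in points}

bins_before = len(occupied_bins(states))    # everything in one bin
for _ in range(30):                         # let chaos act
    states = [step(x) for x in states]
bins_after = len(occupied_bins(states))     # the cluster has spread out
```

There are still exactly 10 states at the end; only the coarse-grained view (how many bins they appear to fill) has grown.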

This reasoning assumes that each state evolves effectively at random. It also doesn’t say anything about how quickly these states spread out through the set of all states; things don’t just immediately jump into their highest-entropy macrostate.

So entropy never decreases. But only with a large, complicated system and sufficient time does it seem to increase (with high probability).

Wow, I did not mean to spend so long writing this. How has it been an hour already?? This is not fun :(