it's quite simple really. imagine you have a stack of photos, where each photo is a frame of a video. if you then look at this stack from the side you would see something like the image above (well, it would be mostly white, since with real photos the layer with the image is really thin compared to the paper it's on). where the train stopped you see only horizontal lines (the train car), because in that part of the stack all the photos are the same.
unlike a normal photo, where every part is from the same moment in time, here each column of pixels is from a different time (in the image above the stuff on the left is the oldest). because of this, the width of an object in the picture tells you how fast or slow it traveled past the camera: fast is thin and slow is wide. most of the thin vertical lines are poles and other stuff close to the track which passed by really fast, so the camera only saw them in one frame. the other extreme is the car on the far right of the image, which is stretched out because it was actually going uphill in reverse (wonder why).
if you have a scanner you can try this yourself. just move the stuff while you scan it.
i made this by taking one column of pixels (the middle one) from each frame of the video (in this case a live feed from the camera on my laptop) and then putting the columns next to each other from left to right to make the image.
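here is a rough sketch of that idea in python, not the code i actually used. it assumes each frame is just a list of rows of pixel values (a real version would read frames from a camera or video file instead), and the name `slit_scan` and the toy frames are made up for illustration.

```python
def slit_scan(frames, column=None):
    """Build a slit-scan image from a sequence of equally sized frames.

    Each frame is a list of rows, and each row is a list of pixel values.
    The result has one column per frame, with the oldest frame on the left.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame")
    height = len(frames[0])
    width = len(frames[0][0])
    # Default to the middle column, as described above.
    x = width // 2 if column is None else column
    # Row r of the output holds pixel (r, x) of every frame, in time order.
    return [[frame[r][x] for frame in frames] for r in range(height)]


# Toy "video": 7 frames, 3 rows by 7 columns, with a 1-pixel-wide object
# that moves one pixel to the right in every frame.
frames = [
    [[1 if c == t else 0 for c in range(7)] for _ in range(3)]
    for t in range(7)
]

image = slit_scan(frames)
for row in image:
    print(row)
```

the object crosses the middle column in exactly one frame, so it ends up as a single thin vertical line in the result. if you feed it the same frame over and over (like the stopped train), every output row is one constant value, which gives you the horizontal lines.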
the technique is called slit scan because in the old days they did it by putting a board with a thin slit in front of the camera and moving the board while exposing the film.