General Relativity For Undergrads

Starting at senior undergraduate level, there are only a few things one needs to learn to gain a much deeper understanding of GR than is given in lower division modern physics courses. I will present them here. The problems I give should go quickly, and I walk you through them. Since there are so few, I encourage you to write down some of the things in the main text with your own hand. There is something to seeing yourself writing it down that brings a new level of understanding.


Contents:

1) The Idea of a Manifold

2) Vector Transformations Under Changes of Coordinate Systems (a review)

3) Tensors and Cotensors (the language of relativity)

4) Space-time Metric (how time and distance are measured)

5) The Equivalence Principle and Doing Physics

6) Tensor calculus

7) Einstein's Equation

Here goes. Feel free to send me your questions. Or my mistakes.



1) What is a manifold?

A manifold is a slightly more general type of coordinate space than the Euclidean space we all know and love. That is, it is a space that we can put a coordinate system on. The generalization is that the coordinate system need not be the same everywhere, and this allows us to describe positions in weirder spaces. As an example of a coordinate system that is not the same everywhere, imagine a 2-d plane where the first quadrant (including the axes) has a rectangular coordinate system and the second, third and fourth quadrants (excluding the axes of the rectangular system) have a polar coordinate system. This is still Euclidean 2-space, and it has a good coordinate system in the sense that any point in the space can be UNIQUELY described with it. (From now on I will refer to a local means of describing a point as a coordinate patch; the rectangular coordinates on the first quadrant given here are an example. I will use the term coordinate system for our means of describing a point anywhere in the space, that is, all the coordinate patches together. There is no need to distinguish the two terms when one coordinate patch covers the entire space.)

Also, we could have the parts of the coordinate system overlap: say, rectangular coordinates in quadrants one and two and polar coordinates in quadrants two, three and four. Where the coordinate patches overlap we can write equations transforming the polar coordinates to rectangular coordinates and vice versa. The points in the overlap region are still uniquely described even though you can specify a given point in the region with either polar or rectangular coordinates. This is because of the identification (saying one thing IS another) these transformation equations give (a subtle point, but worth pondering and understanding).
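
As a concrete instance of such transformation equations, here is a minimal sketch in Python. The language and the function names rect_to_polar and polar_to_rect are my own choices for illustration, not part of the discussion above. It takes a point in quadrant two, where the two patches overlap, converts it from the rectangular patch to the polar patch and back, and lands on the same point, which is the identification at work.

```python
import math

def rect_to_polar(x, y):
    """Rectangular patch -> polar patch (valid away from the origin)."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x) % (2 * math.pi)  # angle reported in [0, 2*pi)
    return r, phi

def polar_to_rect(r, phi):
    """Polar patch -> rectangular patch."""
    return r * math.cos(phi), r * math.sin(phi)

# A sample point in quadrant two, chosen arbitrarily.
x, y = -1.0, 2.0
r, phi = rect_to_polar(x, y)
x_back, y_back = polar_to_rect(r, phi)
print("polar description:", (r, phi))
print("back in rectangular:", (x_back, y_back))
```

Either pair of numbers names the same point; the two functions are what let us say so.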

To see why dropping this 'coordinates the same everywhere' requirement allows us to put coordinate systems on weirder spaces, consider the surface of a sphere. Longitude and co-latitude (theta=0 at the north pole, pi/2 at the equator and pi at the south pole, as in spherical coordinates) do not describe every point on a sphere's surface uniquely, because at co-latitude 0 any longitude describes the same point, the north pole. The same goes for co-latitude pi and any longitude, the south pole. But we CAN uniquely describe every point on the sphere with a coordinate system broken into three patches: the co-latitude-longitude system (excluding the poles), a rectangular coordinate system projected onto the northern hemisphere and another rectangular coordinate system projected onto the southern hemisphere. After we give transformation equations between coordinate patches, we have a way to uniquely describe any point in the space even though it could not be done with a single coordinate patch.

So, a manifold is a space wherein we can set up coordinate patches that allow us to uniquely describe any point in the space. Any set of coordinate patches that does this works.

It is an assumption of GR that space-time exists and is a manifold. This is a hard idea to grasp, that the space we live in need not be describable using a single coordinate patch. I think it is impossible for us to picture any space other than Euclidean space that is not an embedding in a higher dimensional space (like the surface of a sphere, a 2-d space we picture as a subset of a 3-d space; this is often referred to as an embedding). Sorry. But this does tell us something important about the space we live in: locally we can describe points in rectangular coordinates. That is, when we set up a coordinate system, it seems we can always find a way to do it with only rectangular coordinate patches. Why we can't just sew the coordinate patches together to get one big coordinate patch that covers the entire universe is another thing I don't think we can picture. Just keep in mind the idea of a rectangular coordinate patch around you, but not extending infinitely far away.

PROBLEMS:
1. Is a circle a manifold? Is a figure 8? Give a coordinate system for a figure 8.
2. Come up with the transformation equations between the co-latitude-longitude (theta,phi) and north rectangular (xN,yN) coordinates described above. Are these equations infinitely differentiable? If so, the manifold is called a C-infinity manifold. Can you think of another manifold which is not C-infinity?
3. Come up with a single coordinate patch that gives a unique description of every point on a sphere. If you get one, can I be coauthor on your paper that is going to blow all the mathematicians away? They think it's impossible.

2) Vector Transformations Under Changes of Coordinate Systems

You should have seen Jacobians before. Here is a review:
When one changes coordinate systems, for example by a rotation of a rectangular system on Euclidean 2-space, there are transformation equations
x'=x cos(t) + y sin(t)
y'= -x sin(t) + y cos(t)
The Jacobian of this transformation is a matrix such that when it acts upon the components of a vector in the untransformed coordinates you get the components of the same vector in the transformed coordinates. The determinant of the Jacobian gives the factor by which an element of volume changes size upon the change of coordinates. For the aforementioned rotation this determinant is 1, as you would expect.

The i,j component of the Jacobian is $J_{ij} = \partial x'_i/\partial x_j$, where the $x'_i$ are the transformed coordinates and the $x_j$ are the untransformed coordinates.
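
If you want to check a Jacobian you worked out by hand, a rough numerical version of this definition can help. The following is only a sketch in Python under my own assumptions: the helper name jacobian, the central difference step h, and the choice of the rectangular-to-polar map as the test case are all mine. It approximates each partial derivative by nudging one untransformed coordinate at a time.

```python
import math

def jacobian(transform, point, h=1e-6):
    """Approximate J[i][j] = d x'_i / d x_j at `point` by central differences."""
    n = len(point)
    m = len(transform(point))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(point)
        minus = list(point)
        plus[j] += h    # nudge only the j-th untransformed coordinate
        minus[j] -= h
        f_plus = transform(plus)
        f_minus = transform(minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

def rect_to_polar(p):
    """Sample change of coordinates: (x, y) -> (r, phi)."""
    x, y = p
    return [math.hypot(x, y), math.atan2(y, x)]

J = jacobian(rect_to_polar, [3.0, 4.0])
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(J)
print("determinant:", det)  # close to 1/r = 0.2 at this point
```

The determinant comes out near 1/r, the reciprocal of the familiar factor r in the polar area element r dr dphi, because this map runs from rectangular to polar rather than the other way.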

This next part is extremely important! As you are used to seeing in linear algebra, the transformed vector can be written $v' = Jv$. We can write each component as a sum, $v'_i = \sum_{j=1}^{n} (\partial x'_i/\partial x_j)\, v_j$, but I hate writing capital sigma summation symbols, since it's freakin obvious that, with no j subscripts on the left of the equation, something happened to make them disappear. Einstein agrees with me, so he started the convention that when two things with the same subscript are multiplied together it is implied that you sum over that subscript from 1 to n, where n is the dimension of the space you are working in. So we can just write $v'_i = (\partial x'_i/\partial x_j)\, v_j$. Easy schmeezy. This might seem like it saves very little ink right now, but just wait. GR has summations galore, and this implied summation convention makes the difference between seeing an equation that makes you crap your pants and seeing the same equation and actually understanding something!
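
If the implied sum feels slippery, it may help to see it spelled out as loops. This is just a sketch in Python; the function name transform_vector and the numbers in J and v are made up for illustration. The free index i labels which equation you are looking at, and the repeated index j is the one that gets summed away.

```python
def transform_vector(J, v):
    """v'_i = (dx'_i/dx_j) v_j with the implied sum over j written out."""
    v_prime = [0.0] * len(J)
    for i in range(len(J)):        # free index: one equation per value of i
        for j in range(len(v)):    # repeated index: summed from 1 to n
            v_prime[i] += J[i][j] * v[j]
    return v_prime

J = [[2.0, 1.0],
     [0.0, 3.0]]   # made-up Jacobian components
v = [1.0, 4.0]     # made-up vector components

print(transform_vector(J, v))  # [6.0, 12.0]
```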

PROBLEMS:
1. Find the Jacobian of the rotation discussed above.
2. Start with a rectangular coordinate system which covers all of your Euclidean space. Consider a constant vector field, that is, the same vector is assigned to each point in space. Transform to spherical coordinates and use the Jacobian of the transformation to transform the vector field. What is happening far from the origin? Consider the determinant of the Jacobian there.
3. Show that a Lorentz boost is simply a rotation of a 4 component coordinate system. Start with an unprimed rectangular coordinate system $x_1=x$, $x_2=y$, $x_3=z$, $x_4=ict$, and rotate it such that the $x'_1$ axis makes an angle t with the $x_1$ axis and the $x'_4$ axis makes an angle t with the $x_4$ axis while the other axes remain stationary (the angle t here is not the time t in ict). This can be drawn like a rotation in a plane (just don't draw the axes that don't change) and the transformation equations will look very similar. Examine what happens to various planes from the unprimed system in the primed system. Start with the y=0 plane. Now try the x=0 plane. Write $x_4$ as ict and take the time derivative of the position of the x=0 plane as seen in the primed reference frame. Call this v. Determine what sin(t) and cos(t) must be equal to. Now rewrite the transformation equations that looked like a planar rotation, substituting in what you now know to be the sin(t) and cos(t) values and writing everything in terms of x', y', z', and t'. Kaboom, the Lorentz transformations.
4. What is the determinant of the Jacobian for the transformation in problem 3? Do the math.

3) Tensors and Cotensors

The question of what a tensor is haunted me through most of my undergraduate career. It's really not so bad though. Think of a vector field, like the electric field due to a point charge at the origin of some coordinate system. At every point in space there is a vector. Now a subtle point: does the vector exist at the point? Does its tail start there? Well, what would a vector with units of force per charge be doing sitting in a space with a coordinate system that has units of distance? No, the vector corresponds to the point in the following sense: we plug the coordinates of the point in space into the equation giving the field to find the vector corresponding to that point. We plug in n numbers and get out n numbers.

A vector is an example of a local geometric object. A local geometric object is defined using a point in space. If I assigned a different fruit to each point in space, or even the same fruit to every point in space, then that fruit would be a local geometric object. Vectors have the property that their transformation law under change of coordinates is a linear equation, as discussed above.

TENSORS ARE local geometric objects with linear transformation equations under change of coordinates.

So yes, all vectors are tensors, but there are more kinds of tensors than just vectors. But I feel it necessary to point something out: vectors are often written as column matrices. This does not mean that all column matrices are vectors! I'll leave it to you to come up with a column matrix which does not transform as a vector. (Hint: give some function of the untransformed spatial coordinates for position 1,1, another function for position 2,1 and so on (or stop there), then write these functions in terms of the transformed coordinates, transform using the Jacobian, then pick a point in the untransformed space, find the components of the untransformed matrix there, find the transformed coordinates of the point, plug them into the transformed matrix components and compare the transformed and untransformed matrices. Don't compare components, that is not fair. Try maybe 'magnitudes', which should be invariant. Other less complex examples work.)

Let's talk about some vectors we know.

Velocity is a vector. Say we have some pointlike body moving along some arbitrary path described by the n equations $x_i = x_i(t)$, where t is time. The components of its velocity vector are then $v_i = dx_i/dt$. How does this transform? Using the total differential, $v_i = (\partial x_i/\partial x'_j)\, dx'_j/dt = (\partial x_i/\partial x'_j)\, v'_j$. Swapping the primed and unprimed coordinates (just a change in notation) gives the transformation from unprimed to primed coordinates like in section 2:

$v'_i = (\partial x'_i/\partial x_j)\, v_j$.

The gradient of a scalar field is another vector we know and love. Remember that a scalar field is an assignment of a number (only one this time, as opposed to n for a vector field) to each point in space; this number is another example of a local geometric object! If the scalar field is given by f(x), the gradient vector is given by $v_i = \partial f/\partial x_i$. Now how does this transform? Let's first consider how f(x) transforms. To get f(x) in the transformed system just write f(x(x'))=f'(x'), where f' has a different functional form than f but f evaluated at x equals f' evaluated at x' when x and x' refer to the same point in space. So again using the total differential, the transformation of the gradient is
$\partial f/\partial x_i = (\partial x'_j/\partial x_i)\,(\partial f(x(x'))/\partial x'_j) = (\partial x'_j/\partial x_i)\,(\partial f'(x')/\partial x'_j)$
and when we switch the primed and unprimed coordinates (just a change of notation) and call the gradient vector v

$v'_i = (\partial x_j/\partial x'_i)\, v_j$

Compare the transformation laws for velocity vectors and gradient vectors. You should have that feeling in your stomach like I did something wrong. Why would these two kinds of vectors have different transformation laws? Precisely because they ARE two different kinds of vectors! Notice that in the velocity vector the spatial differential is in the numerator, whereas in the gradient the differential is in the denominator. As you can see by reviewing my use of total differentials to obtain the transformation equations above, this is what is responsible for the difference. Any vector that transforms like velocity is called a contravariant tensor (aka contratensor, contravector), and any vector that transforms like a gradient is called a covariant tensor (aka cotensor, covector). So that we can tell whether a vector is a co or contra tensor, it is convention that tensors are always written in component form with contratensors having superscripts and cotensors having subscripts (my mnemonic is co means lo, as in subscript). Also, multiplying the velocity vector equations by dt we see that the differential $dx_i$ is a component of a contratensor and so MUST be written $dx^i$. So the transformation equations are

contravector: $v'^i = (\partial x'^i/\partial x^j)\, v^j$

covector: $v'_i = (\partial x^j/\partial x'^i)\, v_j$

Notice that the repeated indices now appear as one superscript and one subscript (a superscript in the denominator counts as a subscript). This will always happen and is part of the amazing beauty of the Einstein summation convention. I now ask for forgiveness from those who have seen this before for the 'poor notation' I have used until now, but it is my belief that without seeing why superscript indices are necessary, students (including myself) tend to revert to the default reading of superscripts as exponents. Thus I justified notation novel to the student before using it.
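
To convince yourself that a gradient really does follow the covector law and not the contravector law, here is a small numerical sketch in Python. Everything specific in it is an assumption of mine for illustration: the linear change of coordinates x' = 2x + y, y' = y, the scalar field f = x*x*y, the sample point, and the helper names. It computes the gradient directly in the primed coordinates and compares that with both transformation laws applied to the unprimed gradient.

```python
J = [[2.0, 1.0], [0.0, 1.0]]        # J[i][j] = dx'^i/dx^j for x' = 2x + y, y' = y
J_inv = [[0.5, -0.5], [0.0, 1.0]]   # J_inv[j][i] = dx^j/dx'^i (inverse of J)

def to_old(p_new):
    """Unprimed coordinates as functions of the primed ones."""
    xp, yp = p_new
    return [(xp - yp) / 2.0, yp]

def f(p):
    """Made-up scalar field in unprimed coordinates."""
    x, y = p
    return x * x * y

def f_new(p_new):
    """The same field written in primed coordinates: f'(x') = f(x(x'))."""
    return f(to_old(p_new))

def gradient(func, p, h=1e-6):
    """Central-difference approximation to the partial derivatives of func at p."""
    grad = []
    for i in range(len(p)):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        grad.append((func(plus) - func(minus)) / (2 * h))
    return grad

p_old = [1.0, 3.0]
p_new = [2 * p_old[0] + p_old[1], p_old[1]]   # same point, primed coordinates

grad_old = gradient(f, p_old)
grad_direct = gradient(f_new, p_new)          # differentiate in the primed system
grad_co = [sum(J_inv[j][i] * grad_old[j] for j in range(2)) for i in range(2)]
grad_contra = [sum(J[i][j] * grad_old[j] for j in range(2)) for i in range(2)]

print("direct:         ", grad_direct)
print("covector law:   ", grad_co)      # agrees with direct
print("contravector law:", grad_contra)  # does not
```

Because this Jacobian is not orthogonal, the two laws give visibly different answers, and only the covector law matches the direct calculation.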

You now may be wondering why you have never seen these two different transformation equations before; after all, you have doubtlessly done E&M problems which involved transforming electric fields (which are gradients of a scalar field and so cotensors, the less familiar of the two). The answer is, you have not done weird enough transformations. If you do a transformation which has an orthogonal Jacobian then the two transformation 'matrices' are identical. Here is the math:

The orthogonality condition is $J^T J = 1$ or $J^T = J^{-1}$
The transformation is $v'^i = (\partial x'^i/\partial x^j)\, v^j$
so the transpose of the Jacobian is $(\partial x'^i/\partial x^j)^T = \partial x'^j/\partial x^i$ (just switch the indices)
And the inverse transformation is $v^i = (\partial x^i/\partial x'^j)\, v'^j$
so the inverse of the Jacobian is $\partial x^i/\partial x'^j$
Substituting the orthogonality condition into the inverse transformation gives
$v^i = (\partial x^i/\partial x'^j)\, v'^j = (\partial x'^j/\partial x^i)\, v'^j$
So one could use the transformation laws for covectors on contravectors (and vice versa) and hence there is no need to distinguish between the two. (The high and low indices do not match up right in this equation; this is okay only in the case that the two types of tensors are indistinguishable under the specific transformation.)
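
Here is a quick numerical check of that claim, as a sketch in Python. The rotation angle, the stretch matrix, the sample components and the helper names are all my own choices for illustration. For the rotation (orthogonal Jacobian) the two laws give the same components; for the stretch (not orthogonal) they do not.

```python
import math

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def contra_law(J, v):
    """v'^i = (dx'^i/dx^j) v^j"""
    return [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]

def co_law(J, v):
    """v'_i = (dx^j/dx'^i) v_j, where dx^j/dx'^i is the inverse Jacobian."""
    J_inv = inverse_2x2(J)
    return [sum(J_inv[j][i] * v[j] for j in range(2)) for i in range(2)]

angle = 0.3  # arbitrary rotation angle in radians
rotation = [[math.cos(angle), math.sin(angle)],
            [-math.sin(angle), math.cos(angle)]]   # orthogonal Jacobian
stretch = [[2.0, 0.0],
           [0.0, 1.0]]                             # not orthogonal

v = [1.0, 2.0]
print(contra_law(rotation, v), co_law(rotation, v))  # same components
print(contra_law(stretch, v), co_law(stretch, v))    # different components
```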

So now, what are the operations one can use on tensors? Let's start with multiplication. Scalar multiplication is fairly obvious: $c(v^i) = cv^i$. But multiplication of tensors by each other is going to be more complicated. Let's multiply a covector by a contravector and see what could happen. $A^i B_i$ is going to be a scalar after we do the sum. $A^i B_j$, however, has no repeated index. This is called a tensor product. Let's use the notation $A^i B_j = C^i_j$. How is this beast going to transform? Well, we know how to transform the factors, so $C'^i_j = (\partial x'^i/\partial x^k) A^k\, (\partial x^l/\partial x'^j) B_l = (\partial x'^i/\partial x^k)\,(\partial x^l/\partial x'^j)\, C^k_l$
This is a linear transformation law! So this thing is also a tensor! It has both a subscript and a superscript, so it is neither contravariant nor covariant; it is called a mixed tensor. Since it has a total of two indices it is called a rank 2 tensor. Vectors (both types) are called rank one tensors, and scalars are rank zero tensors. One can now imagine that you could tensor multiply any number n of contra and covectors and get a rank n tensor. There will be one Jacobian factor in the transformation law for each rank one tensor factor. Even if a higher rank tensor cannot be decomposed into rank one tensors, as long as it follows a linear transformation law (which will always be of the n Jacobian factors form; the proof is too long for here) it can be used. If all the indices are superscripts it is a rank n contravariant tensor; if all the indices are subscripts it is a rank n cotensor. If some of the indices are superscripts and some are subscripts it is a mixed tensor. (I have intentionally used all the aliases for tensors in this paragraph to get the reader familiar with as many of them as possible.)
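
The two-Jacobian-factor law for the tensor product can be checked with a few lines of arithmetic. The sketch below is in Python, and the Jacobian and the components of A and B are numbers I made up for illustration. It builds the product in the old coordinates, transforms it with one Jacobian factor per index, and compares the result with the product of the separately transformed factors. It also confirms that the sum over the repeated index gives the same scalar in both systems.

```python
J = [[2.0, 1.0], [0.0, 1.0]]        # J[i][k] = dx'^i/dx^k (made-up example)
J_inv = [[0.5, -0.5], [0.0, 1.0]]   # J_inv[l][j] = dx^l/dx'^j (inverse of J)
n = 2

A = [1.0, 3.0]   # contravector components A^k
B = [4.0, 5.0]   # covector components B_l

# Tensor product in the old coordinates: C[k][l] stands for C^k_l = A^k B_l.
C = [[A[k] * B[l] for l in range(n)] for k in range(n)]

# Route 1: transform each factor, then multiply.
A_new = [sum(J[i][k] * A[k] for k in range(n)) for i in range(n)]
B_new = [sum(J_inv[l][j] * B[l] for l in range(n)) for j in range(n)]
C_from_factors = [[A_new[i] * B_new[j] for j in range(n)] for i in range(n)]

# Route 2: transform C itself with one Jacobian factor per index.
C_new = [[sum(J[i][k] * J_inv[l][j] * C[k][l]
              for k in range(n) for l in range(n))
          for j in range(n)]
         for i in range(n)]

# The sum over the repeated index, A^i B_i, is the same in both systems.
scalar_old = sum(C[i][i] for i in range(n))
scalar_new = sum(C_new[i][i] for i in range(n))

print(C_from_factors)
print(C_new)
print(scalar_old, scalar_new)
```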

You now know what a tensor is.

Now the question: 'Why are tensors so important?' The answer is, because of their transformation properties, a tensor equation that is true in one coordinate system must be true in any other coordinate system used! Think about it: if you have a tensor equation and transform it to another coordinate system then you have Jacobian factors hanging out on both sides of the equation and you can just cancel them (not by dividing both sides! Subtract the right from both sides so you have JJJ(left-right)=0, and you know the Jacobian is invertible, so left = right). Since GR wants to generalize from describing things in inertial frames to describing physics in ANY frame in any kind of manifold, the language of tensors is ideal!

4) The Space Time Metric

The dot product of two vectors is invariant under transformations. Likewise $A'^i B'_i = (\partial x'^i/\partial x^k) A^k\, (\partial x^l/\partial x'^i) B_l = \delta^l_k A^k B_l = A^l B_l$, where $\delta^l_k$ is the Kronecker delta.

The distance between two points differentially close (often referred to as neighboring points) in a rectangular coordinate system on a Euclidean space is $ds = (dx^2 + dy^2 + dz^2)^{1/2}$. But what if we switch to a spherical coordinate system?
x=r sin(t)cos(p), y=r sin(t)sin(p), z=r cos(t)
so dx=sin(t)cos(p)dr+r cos(t)cos(p)dt-r sin(t)sin(p)dp, dy=sin(t)sin(p)dr+r cos(t)sin(p)dt+r sin(t)cos(p)dp, dz=cos(t)dr-r sin(t)dt
and substituting gives $ds^2 = (\text{bla bla bla})\,dr^2 + (\text{bla bla})\,dr\,dt + (\text{bla bla})\,dr\,dp + (\text{bla bla})\,dt^2 + \text{bla bla bla}$
The point is, in a general coordinate system there can be cross terms (for spherical coordinates they happen to cancel, but for a system with skewed axes they do not), there are a lot of coordinate systems we may choose, and only rectangular systems give the plain sum of squares. So, again, if we want to talk about physics which is independent of coordinate systems we can say $ds^2 = g_{ij}\, dx^i dx^j$, where the components of g are whatever are needed to make the statement true. Notice this sum over two indices covers all the possible cross terms. Note also that there is a term for dr dt and one for dt dr, so the appropriate g element for one of these terms is 1/2 the (bla bla) above. Also notice that g is symmetric, that is $g_{ij} = g_{ji}$.
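
To see a cross term show up in practice without giving away the spherical case, here is a sketch in Python using skewed axes in the plane. The 60 degree angle between the axes, the coordinate names u and v, and the helper name metric_from_rect are my own assumptions for illustration. The code does numerically what the substitution above does by hand: it differentiates the rectangular coordinates with respect to the new ones and collects the coefficients of each product of differentials.

```python
import math

def metric_from_rect(to_rect, point, h=1e-6):
    """g[i][j] = sum over k of (dx_k/du_i)(dx_k/du_j), with x_k rectangular
    coordinates and u_i the new coordinates, by central differences."""
    n = len(point)
    derivs = []
    for i in range(n):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        x_plus, x_minus = to_rect(plus), to_rect(minus)
        derivs.append([(a - b) / (2 * h) for a, b in zip(x_plus, x_minus)])
    return [[sum(a * b for a, b in zip(derivs[i], derivs[j]))
             for j in range(n)]
            for i in range(n)]

ANGLE = math.radians(60)  # assumed angle between the skewed axes

def skew_to_rect(p):
    """Skewed coordinates (u, v) -> rectangular (x, y)."""
    u, v = p
    return [u + v * math.cos(ANGLE), v * math.sin(ANGLE)]

g = metric_from_rect(skew_to_rect, [1.0, 2.0])
print(g)  # approximately [[1, 0.5], [0.5, 1]]
```

The off-diagonal entries are each cos(60 degrees) = 0.5, so the du dv cross term in ds^2 has coefficient 2(0.5) = 1, split evenly between g for (u,v) and g for (v,u) just as described above.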

Recall that the invariant interval between neighboring events in special relativity (with rectangular coordinates) is $ds^2 = dx^2 + dy^2 + dz^2 - (c\,dt)^2$, so there g has 1, 1, 1, $-c^2$ down the diagonal and 0 off the diagonal. A space like this is called Minkowskian (as opposed to Euclidean, which has 1s all down the diagonal and 0s off it). This is a direct result of the postulate that the speed of light is the same in all reference frames. The same assumption is made in general relativity, and thus all g are similar to (there exists a similarity transformation between) the special relativity g above. This just means that locally we can transform any coordinate patch into rectangular coordinates. g is called the spacetime metric because it is essential in describing distances.

Hold up yo! $A^l B_l$ is invariant and so is $ds^2 = g_{ij}\, dx^i dx^j$. Couldn't we divide both sides by $dt^2$ and choose neighboring points along some arbitrary curve parameterized by t, and thus it must be true for any contravectors E and F that $g_{ij} E^i F^j$ = invariant constant. Thus given a contravariant vector A with components $A^i$ we can define a covariant vector, which we will also call A, with components $A_i = g_{ij} A^j$. So, the metric also acts as an index lowering device. This also works on general higher rank tensors to lower one of the superscripts. Does g have any kind of inverse that raises indices? Yes, and that inverse is precisely the index raising tensor. For notice that in
$g_{ij} E^i F^j$ = invariant constant
we could interpret this as lowering the index of E then summing, or lowering the index of F then summing, so
$E_i F^i = E^i F_i$
and if $g^{ij}$ is the index raising tensor we must then have
$E^i F_i = g^{ik} E_k\, g_{il} F^l = g^{ik} g_{il}\, E_k F^l$
Since this has to equal $E_k F^k$, we must have $g^{ik} g_{il} = \delta^k_l$, and since the Kronecker delta has the components of the identity matrix, the index raising tensor is the matrix inverse of the index lowering tensor, which is also the spacetime metric.
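
Lowering and then raising an index should get you back where you started. The sketch below checks this in Python for a small example; the 2x2 metric (one with a cross term), the sample components and the function names lower and raise_index are all assumptions of mine for illustration.

```python
g = [[1.0, 0.5],
     [0.5, 1.0]]   # assumed example metric g_ij with a cross term

def inverse_2x2(m):
    """Matrix inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

g_inv = inverse_2x2(g)   # components g^ij of the index raising tensor

def lower(A_up):
    """A_i = g_ij A^j"""
    return [sum(g[i][j] * A_up[j] for j in range(2)) for i in range(2)]

def raise_index(A_down):
    """A^i = g^ij A_j"""
    return [sum(g_inv[i][j] * A_down[j] for j in range(2)) for i in range(2)]

A_up = [3.0, -1.0]
A_down = lower(A_up)
A_back = raise_index(A_down)

# g^ik g_il should be the Kronecker delta (the identity matrix).
delta = [[sum(g_inv[i][k] * g[i][l] for i in range(2)) for l in range(2)]
         for k in range(2)]

print(A_down)   # [2.5, 0.5]
print(A_back)   # back to approximately [3.0, -1.0]
print(delta)
```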

PROBLEMS:
1. Find the bla bla bla terms for the distance between two points in spherical coordinates and arrange them in a matrix $g_{ij}$.
2. Show that if $A^i$ is an arbitrary tensor and $B_i$ is some collection of numbers not known to be a tensor, and if $A^i B_i$ is an invariant constant, then $B_i$ must be a tensor. Use the definition of a tensor, and transform the invariant constant and A to some new coordinate system to determine the transformation equation of B. (Hint: what do you need to get the Kronecker delta?)

5) The Equivalence Principle and Doing Physics

On a philosophical note, what is it to be a distance? It would be nice to describe this only in terms of tangible objects (i.e. not in terms of points of space or differential intervals, which are abstract notions). We can say that two points of space are ten feet apart if, when a test particle is placed at each point, ten rulers (as identical as indistinguishable particles, and we define rulers to be 1 foot long) can be placed in a continuous string such that if the string is to touch both the test particles then the touching must occur at the ends of the string. It is tough to define distance without reference to physical bodies. Whenever we talk about a coordinate system we are referring to some grid of rulers we could set up