SRT Data Reduction Exercise

Assumptions:

Introduction

In this exercise, we will be working with actual data taken from MIT Haystack Observatory’s SRT (Small Radio Telescope).  This data was taken at the Haystack observatory by pointing the SRT at the Orion Nebula in the summer of 2005.  Frequently, radio astronomers look out into space for a specific signal.  The most common element in the universe is hydrogen, and hydrogen has a unique radio frequency signature at 1420.406 MHz.

Question 1: What causes hydrogen to emit this particular frequency of radiation?

Not only can denser regions of hydrogen be located by radio astronomy, but their velocity relative to Earth can be determined from how much the hydrogen line has been Doppler shifted.  In essence, by observing the frequency received from a hydrogen source, we are observing its relative velocity as well.

Question 2: Rewrite the standard Doppler shift formula so that you have an expression for the velocity of the hydrogen source as a function of the observed frequency, the known speed of light, and the known frequency of stationary hydrogen.  State what units your velocity will have.  Did you make any assumptions in your derivation?

Part I

Obtain two files from your instructor, or download them yourself from http://web.haystack.mit.edu/SRT/srtprojects.html.  The two text files contain data exactly as created by the SRT software.  Look at the files with a text editor and note that they contain a nearly incomprehensible stream of numbers.  These numbers represent the response of the telescope at various frequencies.  Our goal is to turn this raw data into meaningful graphs and a meaningful result.

First, we must be able to manipulate this data.  One of the best places to do this is in a spreadsheet like Microsoft Excel.  Using Excel (or a similar spreadsheet application), open the file called “data1.txt”.  The data is delimited; when opening your file, be sure to indicate that the data is “space”, “tab”, and “colon” (“:”) delimited.
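For readers who would rather check the import with a short script, the same splitting step can be sketched in Python.  This is only an illustrative sketch, not part of the SRT software: the function name parse_srt_line and the sample line are made up, and the only facts taken from this handout are that the fields are separated by spaces, tabs, and colons, and that the first 13 fields describe the observation.

```python
import re


def parse_srt_line(line):
    """Split one line on spaces, tabs, and colons.

    Returns a list of floats, or None when the line is empty or contains
    anything that is not a number (for example a text message).
    """
    fields = [f for f in re.split(r"[ \t:]+", line.strip()) if f]
    if not fields:
        return None
    try:
        return [float(f) for f in fields]
    except ValueError:
        return None


# Made-up sample line with only 4 data values, for illustration only.
sample = "2005:171:14:32:05 180.0 45.0 0.0 0.0 1419.75 0.0078125 1 4 100.0 110.0 120.0 105.0"
record = parse_srt_line(sample)

# The first 13 fields describe the observation; the rest are the raw data.
header, spectrum = record[:13], record[13:]
```

Lines that do not parse as numbers come back as None, so they can simply be skipped when reading a whole file.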

Question 3: How many rows and columns of data do you now have?

The SRT records data in a very specific sequence.  You may wish to label the columns in your spreadsheet for your own reference:

 

Column Number    Data

1                Year data was taken
2                Day data was taken
3                Hour data was taken
4                Minute data was taken
5                Second data was taken
6                Azimuth
7                Elevation
8                Offset
9                Offset
10               Lowest frequency data
11               Frequency increment for each additional data column
12               Digital mode of the SRT
13               Number of frequency bins
14…              Raw data columns

Question 4: In what month of 2005 was this data taken?

Question 5: What does “azimuth” mean in this situation?

Question 6: What were the offsets in this data set?

Once you are satisfied that you have the raw data properly identified, save your spreadsheet as “YourLastName1”.  This file and the answers to the questions in this handout should be turned in as part of your lab report. 

Now your task is to make a graph of the raw data.  Selecting the data only (exclude column titles and the first 13 columns), make a line graph.  You should have the intensity of the signal as the y-axis, and the number of the data bin (or column number) as the x-axis.  Save a properly labeled copy of this graph in your spreadsheet.  Before continuing, ask your instructor to verify that everything looks okay.

No radio receiver is perfect.  You should see that the ends of this graph slope up or down and do not line up with the rest of the graph.  These data points have been compromised by the real-world limitation of receiver response.  By looking carefully at the data, you can decide how many data points to eliminate at the beginning of the set and at the end.  Create a new row containing only the remaining “clean” data in your spreadsheet.  Do not erase your original data!

Question 7: How many data points did you eliminate from (a) the beginning and (b) the end of the original data set?

Our next step is to turn this graph into a true spectrum by changing the x-axis to frequency rather than data point number.  Note that column 10 is the frequency of the first raw data point, and that each column afterwards is the previous column’s frequency plus column 11 (the incremental increase in frequency).  Below your row of “clean” data, create another row that specifies exactly the frequency of each data point.  Be sure to use the built-in calculator feature of the spreadsheet; do not determine each data point’s frequency by hand!
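The frequency row follows a simple rule: data point number i (counting from zero) sits at the column 10 value plus i times the column 11 value.  A minimal Python sketch of that rule, together with the trimming step, is given here for comparison with your spreadsheet formula.  The function names, the numbers, and the trim counts are made up for illustration and are not taken from the data files.

```python
def frequency_axis(first_freq_mhz, step_mhz, n_bins):
    """Frequency of every raw data point: first + i * step, for i = 0 .. n_bins - 1."""
    return [first_freq_mhz + i * step_mhz for i in range(n_bins)]


def trim(values, n_start, n_end):
    """Drop n_start points from the beginning and n_end points from the end."""
    return values[n_start:len(values) - n_end]


# Illustrative numbers only: 8 bins starting at 1419.75 MHz, 0.0078125 MHz apart.
freqs = frequency_axis(1419.75, 0.0078125, 8)

# Apply the same trim to the frequencies as to the intensities so that
# each clean data point keeps its own frequency.
clean_freqs = trim(freqs, 2, 1)
```

The important design point is that the frequencies are computed for all raw points first and trimmed afterwards; computing them from the trimmed row alone would shift every frequency by the number of points removed from the beginning.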

Question 8: What is the frequency of your last “clean” data point?

To make our new graph, you now should select both groups of numbers (the clean data and their corresponding frequencies) and make an x-y scatter plot.  Carefully label the axes of the graph and save it in your spreadsheet as well.

Question 9: What is the peak frequency in your spectrum?

Question 10: Does this mean Orion is moving towards the Earth or away from the Earth?  Use your answer to Question 2 to determine a speed.

Part II

The graphs in Part I are a bit “noisy”.  If you look at them closely, even the graph with the “clean” data is not smooth.  In general, there are two kinds of errors: random and systematic.  Taking more data and averaging it can correct for random errors.  This should give smoother results, as truly random errors fluctuate both up and down and thus cancel out over the long term.  Systematic errors, on the other hand, require a more detailed analysis.

The other data set, “data2.txt”, is the same as Part I’s data set in every respect, except that the antenna took data for a longer period of time, and thus generated multiple data sets.  You may notice some messages from the SRT to the data file; these messages can be deleted.  Follow the same procedure used in Part I to import the data file into a spreadsheet.  Save this file as “YourLastName2”.

Question 11: For how long was the SRT taking data?

Question 12: How many rows of data do you have this time?

Question 13: Why do the azimuth and elevation change slightly as this data is being recorded?

Your first step is to average all the raw data.  Create a new row of data below all the rest.  Use the built-in features of the spreadsheet to fill in the data columns with the average of all of the raw data in each column.
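As a cross-check on the spreadsheet's averaging function, the column-by-column average can be sketched in a few lines of Python.  The function name average_rows and the three short rows are made up for illustration; each inner list stands for the raw data columns (column 14 onward) of one row.

```python
def average_rows(rows):
    """Average several equal-length rows column by column."""
    n = len(rows)
    # zip(*rows) groups the first value of every row, then the second, and so on.
    return [sum(column) / n for column in zip(*rows)]


# Made-up raw data: three rows of three data points each.
rows = [
    [100.0, 110.0, 120.0],
    [102.0, 108.0, 124.0],
    [98.0, 112.0, 116.0],
]
averaged = average_rows(rows)
```

Note that only the raw data columns should be averaged this way; averaging the time or pointing columns has no meaning for the spectrum.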

Using only these averaged results, repeat Part I’s procedure of graphing intensity vs. frequency (a spectrum).  Your result should be similar to Part I’s, only smoother.  This is an important point: one data point may look like noise, but several averaged together may reveal a pattern!  Label this graph and save it in your spreadsheet.

Question 14: Describe the shape of your graph.  What can you say about the “baseline”?  (The baseline is the background signal that is not part of the peak.)

Ideally, the baseline would be a flat line at zero.  A baseline that slopes or sits above zero is a systematic error, and we are now going to correct for it.  You should figure out a way to “normalize” your data such that the peak looks fairly symmetric and the baseline is flat and close to zero.  The simplest correction would be to fit a straight line to the current baseline and, at each point, subtract the value of that line from your clean data.
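The straight-line correction can be sketched in Python as follows.  This is one possible approach under stated assumptions, not a prescribed method: the line is fitted by ordinary least squares, and only to points that you have judged to lie off the peak.  The function names, the index list, and the numbers are all made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_baseline(freqs, values, baseline_idx):
    """Fit a line to the points listed in baseline_idx and subtract it everywhere."""
    xs = [freqs[i] for i in baseline_idx]
    ys = [values[i] for i in baseline_idx]
    slope, intercept = fit_line(xs, ys)
    return [v - (slope * f + intercept) for f, v in zip(freqs, values)]


# Made-up data: a sloping baseline (10 + f) with a peak of height 5 at index 3.
freqs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
values = [10.0, 11.0, 12.0, 18.0, 14.0, 15.0, 16.0]

# Index 3 is left out of the fit because it belongs to the peak.
flat = subtract_baseline(freqs, values, [0, 1, 2, 4, 5, 6])
```

Leaving the peak out of the fit matters: if the peak points were included, they would pull the fitted line upward and the subtraction would distort the very feature you are trying to measure.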

[Figure: spectrum with the baseline labeled]

Now you will have “squeaky clean” data.  Construct a new spectrum with this normalized data (which has been corrected for both random and systematic error as much as possible).

Question 15:

(a) What is the peak frequency in this data?

(b) By what percent did your peak frequency shift after applying the correction to the baseline?

Question 16: Use your answer to Question 2 to determine the speed of Orion relative to the Earth.

Question 17: Compare your resultant speed from Part II to that obtained in Part I.  What can you conclude about data analysis and data reduction?

Question 18: Sometimes the error of a peak measurement is characterized by the width of that peak at half of its maximum.  Determine the possible error on your speed of Orion value using this technique.
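The width at half maximum can be read off a graph by eye, or estimated numerically.  The Python sketch below shows one way to do the latter, assuming the data has already been baseline-corrected so that "half of its maximum" is measured from zero, and assuming a single peak.  The function names and the sample numbers are made up for illustration; converting the resulting width in frequency to an error in speed is left to you.

```python
def _crossing(x0, y0, x1, y1, level):
    """x position where the straight line from (x0, y0) to (x1, y1) reaches level."""
    return x0 + (level - y0) * (x1 - x0) / (y1 - y0)


def width_at_half_max(xs, ys):
    """Full width of the highest peak at half of its maximum value."""
    peak = max(range(len(ys)), key=lambda i: ys[i])
    half = ys[peak] / 2.0

    # Walk outward from the peak until the signal is no longer above half maximum.
    left = peak
    while left > 0 and ys[left] > half:
        left -= 1
    right = peak
    while right < len(ys) - 1 and ys[right] > half:
        right += 1
    if ys[left] > half or ys[right] > half:
        raise ValueError("signal never falls to half maximum inside the data")

    # Interpolate linearly between the two points that straddle half maximum.
    x_left = _crossing(xs[left], ys[left], xs[left + 1], ys[left + 1], half)
    x_right = _crossing(xs[right - 1], ys[right - 1], xs[right], ys[right], half)
    return x_right - x_left


# Made-up, baseline-corrected data: a triangular peak of height 4 centred on x = 3.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0]
width = width_at_half_max(xs, ys)
```

With noisy data the result depends on how well the baseline was removed, so it is worth comparing the number against what you see on your graph.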

Question 19: (Review your answer to Question 2.)  The calculated error from Question 18 is really only relative to the peak, not relative to the actual speed of Orion.  What final correction would have to be made to the result to determine the actual speed of Orion?