The file flow-occ contains data collected by loop detectors at a particular location of eastbound Interstate 80 in Sacramento, California, from March 14–20, 2003. (Source: http://pems.eecs.berkeley.edu/) For each of three lanes, the flow (the number of cars) and the occupancy (the percentage of time a car was over the loop) were recorded in successive five-minute intervals. There were 1740 such five-minute intervals. Lane 1 is the farthest left lane, lane 2 is in the center, and lane 3 is the farthest right.
a. For each station, plot flow and occupancy versus time. Explain the patterns you see. Can you deduce from the plots what the days of the week were?
b. Compare the flows in the three lanes by making parallel boxplots. Which lane typically serves the most traffic?
c. Examine the relationships of the flows in the three lanes by making scatterplots. Can you explain the patterns you see? Are statements of the form, “The flow in lane 2 is typically about 50% higher than in lane 3,” accurate descriptions of the relationships?
d. Occupancy can be viewed as a measure of congestion. Find the mean and median occupancy in each of the three lanes. Do you think that the distributions of occupancy are symmetric or skewed? Why?
e. Make histograms of the occupancies, varying the number of bins. What number of bins seems to give good representations for the shapes of the distributions? Are there any unusual features, and if so, how might they be explained?
f. Make plots to support or refute the statement, “When one lane is congested, the others are, too.”
g. Flow can be regarded as a measure of the throughput of the system. How does this throughput depend on congestion? Consider the following conjecture: “When very few cars are on the road, flow is small and so is congestion. Adding a few more cars may increase congestion but not enough so that velocity is decreased, so flow will also increase.
Beyond some point, increasing occupancy (congestion) will decrease velocity, but since there will then be more cars in total, flow will still continue to increase.” Does this seem plausible to you? Plot flow versus occupancy for each of the three lanes. Does this conjecture appear to be true? Can you explain what you see? Is the relationship of flow to occupancy the same in all lanes?
h. This and the following exercises require the use of dynamic graphics, e.g., http://www.ggobi.org/. Make time series plots of all the variables. Consider lane 1. Make a one-dimensional display of occupancy and vary the smoothness until you can see some distinct modes. Use brushing to determine when in the time series plots those modes occurred. Do the same for flow and then examine some other lanes.
i. Choose a lane and make one-dimensional displays for flow and occupancy and a scatterplot of flow versus occupancy. Use brushing to simultaneously identify regions in the three plots. Does what you see make sense?
j. From scatterplots of flow versus occupancy, examine when different regions of this scatterplot occur in time. In particular, identify when in the time series plots the flow breaks down because a critical point is reached.
k. You have now seen that all these variables, flow and occupancy in each of the three lanes, are closely related, but because scatterplots are two-dimensional, you have been able to examine only those relationships between pairs of variables. In these scatterplots, the points tend to lie along curves. What happens in higher dimensions?
    i. Examine the relationship of the three flows. In three dimensions, do the points tend to lie along a curve (a one-dimensional object), or do they tend to concentrate on a two-dimensional manifold, or are they scattered over three dimensions?
    ii. Examine the relationships of the three occupancies.
In three dimensions, do the points tend to lie along a curve (a one-dimensional object), or do they tend to concentrate on a two-dimensional manifold, or are they scattered over three dimensions?
    iii. How do the points lie in six dimensions (three flows and three occupancies)? When do different regions occur in time?
l. A taxi driver claims that when traffic breaks down, the fast lane breaks down first so he moves immediately to the right lane. Can you see any such phenomena in the data?
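
As a possible starting point for parts (d) and (e), the following is a minimal Python sketch using only the standard library. It is not a solution, and it rests on assumptions that are not stated in the exercise: that flow-occ is available as a comma-separated file with a header row, and that the occupancy columns carry the hypothetical names listed in `OCC_COLUMNS`. Adjust both to match the actual file.

```python
import csv
import statistics

# ASSUMPTION: hypothetical column names; the real header of flow-occ may differ.
OCC_COLUMNS = ("Lane 1 Occ", "Lane 2 Occ", "Lane 3 Occ")


def read_columns(path, columns=OCC_COLUMNS):
    """Read the named numeric columns from a CSV file that has a header row."""
    data = {name: [] for name in columns}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for name in columns:
                data[name].append(float(row[name]))
    return data


def mean_and_median(values):
    """Return (mean, median) of a sequence of numbers, as asked in part (d)."""
    return statistics.mean(values), statistics.median(values)


def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning [min, max] (part e)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if every value is identical.
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than overflowing.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts
```

With the file in hand, one might call `read_columns("flow-occ.csv")` (a hypothetical file name) and apply `mean_and_median` to each lane. A mean noticeably larger than the median suggests a right-skewed distribution. Calling `histogram_counts` with several values of `n_bins` gives a quick numerical view of how the apparent shape changes with bin count before any plots are drawn.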