Earth RSE
(Remote Sensing for Everyone)
Formerly
GooberEarth
video - 20
minute Windows Media 9 video with narration
What it is:
Earth RSE is
a 3D planetary science platform. It is designed to fill the space
between mainstream 3D mapping software like Google Earth, MS Virtual
Earth 3D, and NASA WorldWind, and expensive and complicated
professional packages like ENVI, ArcInfo, and ERDAS. Earth RSE
supports:
- full
geodetic referencing
- a
pole-free coordinate system
- topography
simplification and bitstream coding
- realtime
vertical scaling
- anisotropic
data selection
- dataset
dependency management
- multithreading
and task queuing
- multi-spectral
image PC decorrelation (using SERM lifting) and JPEG2000 compression
- large
oriented point sets
- multiple
simultaneous texture sets for temporal analysis and blending
Infrastructure for
facilitating more
powerful programmable analysis and complex datasets, namely
asynchronous CPU texture
compositing, is
being developed. A compiler with shader-like semantics will drive
both dataset binding and expression evaluation, exposing a variety of
analytical capabilities while keeping a thin user interface.
Earth RSE matches the consumer
products for ease-of-use by integrating global datasets into a seamless
3D world, yet supports the georeferencing and data management features
necessary to build serious remote sensing applications. The Earth
RSE engine is still in the late-stages of development, but
plenty advanced for demonstration. We are currently looking for
funding for further development.
The main goal of the project is to process and stream to the public the
complete archives of Landsat 5, 7, MODIS, ASTER, and Hyperion over all
spectral bands, along with easy-to-use analysis tools, providing a
continuous 20-year history of planetary change.
What it does and
how it works:
Georeferencing
The system
supports full datum (eight-parameter
reference spheroid plus geoid offsets) georeferencing for
the input and output of data. A
geocentric coordinate system tracks the x- and y-components of unit
vectors in hierarchies of z-pointing local coordinate systems. Topographic
raster data is fed through a preprocessor and a hierarchy of simplified
topography chunks is produced. The preprocessor shoots rays from
the center of the earth and finds the point of intersection with the
default reference spheroid (WGS84), then shoots out a ray normal to the
spheroid surface and intersects input datasets, starting with the
highest resolution sets and moving to
the lowest resolution sets, as each query fails. Upon loading a
raster section, the geoid offsets for the dataset are added to the
decoded values. Sampling of
groundcover works similarly.
On the client
side, uniform vertical scaling is made possible by datum shifting each
vertex, after they are deserialized, into the scaling reference
spheroid coordinate system. (This is WGS84 by default.) The
closest point on the client
spheroid is interleaved into a vertex buffer, along with the geodetic
height from the spheroid to the unscaled vertex. If geoid
referencing is selected, the latitude and longitude of the spheroid
point is computed and the geoid offset is bicubically sampled,
interleaved into the vertex buffer, and subtracted from the geodetic
height. In the vertex shader the spheroid normal at the spheroid
point is computed (this is done trivially as the normal of a quadratic
surface is
computed by taking its gradient), scaled by the geoid offset plus the
geodetic height times the scaling constant, and added back into the
spheroid point.
In addition to
supporting full geodetic referencing, Earth RSE has abandoned the
geographic coordinate system used in most other programs, in which
vertices are spaced along equally intervals of latitude and
longitude. The geographic system, while easy to work with, has
two significant disadvantages. First, chunks of topography or
groundcover become less and less square as latitudes tend towards the
poles, resulting in a sub-optimal distribution of data over the surface
of the globe. Second, the geometry becomes unruly or pathological
at the poles, where many thousands of very thin triangles converge.
The two images
below demonstrate problems with the geographic coordinate system used
in Google Earth. The image on the left shows chunks (groups of
eight triangles sharing a vertex) becoming horizontally compressed as
the latitude increases towards the pole. On the right we see that
as the north pole is approached, increasing numbers of thin triangles
are
rendered with decreasing image quality:
MS Virtual Earth
3D also uses a coordinate system that is singular at the poles,
resulting in extremely inefficient triangulations:
Earth RSE uses
a pole-free coordinate system. The north pole in this image (the
vertex at the top-left of the higher-detail chunk) is shown to behave
like any other vertex. Chunks stay approximately square
and triangles right over the entire surface of the planet. Also
note the coastline tesselation:
Groundcover
Standard RGB
groundcover may be sampled into a single mosaic in the same way that
multiple topography datasets are sampled into a single database.
All other groundcover sets are sampled separately, and grouped and
compressed by families of spectral bands. Groundcover (and any
other raster output) is stored in a chunk hierarchy database.
Metadata on each chunk describes a set of codestreams, each with an ID,
size, numeric type (e.g. uint8, int16, float, etc.), and compression
scheme (e.g. zlib, Jpeg2000, Jpeg2000 with PCA, etc.). Each
codestream includes a set of output components with individual IDs but
inheriting their other attributes from the codestream. This
simple scheme allows a multispectral image and related rasters (perhaps
a bit array for availability and a uint8 array for opacity) to be
compressed separately yet serialized together.
The multispectral JPL Landsat7 distribution used in this demo
includes a half-arcsecond panchromatic band
and six one-arcsecond color
bands. In a real application, the six color bands should be
panchromatically sharpened into a half-arcsecond mosaic, then sampled
together into the system. In this demo application, to better
demonstrate how flexibly multiple raster sets can
be integrated and manipulated at runtime, the panchromatic sharpening
is applied in the pixel shader, so separate pan and color Landsat
databases are constructed.
The panchromatic band is sampled and
compressed using a Jpeg2000 codec (the same
compression used in the popular MrSID format). The
six color bands are sampled and analyzed to produce a principal
component transform. This
color-space rotation is then factored into a product of single-row elementary
row-operation matrices (SERMs) to allow
optional lossless compression. The six color bands are pushed
through the forward transform, and the output channels are Jpeg2000
compressed. On sophisticated encoders the post-compression
rate-distortion algorithm increases
compression performance when the principal components are encoded
together rather than separately.
On the runtime
side of the demo, the six Landsat7 color bands are decompressed and
pushed through the backward principal component transform, then
interleaved into two Direct3D textures (one A8R8G8B8 and one A8L8
texture). In a Shader Model 2.0 pixel shader the six texture
channels are loaded and pushed through a linear transformation (a 6x3
matrix multiplication and a 3-vector addition) and output into the
color output register. In the demo, the color transform is
controlled by a bank of 21 floating point slider controls, which are
queried every frame and packed into vertex shader constants. When
performing runtime pansharpening, the fast IHS fusion method, being
computationally simple enough to perform on every pixel in every frame,
is used on the three output channels of the shader color transform.
Non-linear
multispectral effects are also easy. For example, remote sensing
researchers have fashioned vegetation
index functions by observing
spectral characteristics of chlorophyll. These functions, which
are often normalized to the range -1 to 1, give an estimate of plant
density in a pixel of groundcover. The vegetation index may be
clamped to a non-negative range and set as the saturation of an HSV transform. The hue is
set to green in the Earth RSE demo (although set to red in many other
geoscience applications) and the value is taken from a raster that
gives context to the frame, such as the panchromatic or green color
band.
Both images
below show the same part of California's Central Valley. The
image on the left uses a Landsat 4-5-3 band combination which tends to
color irrigated fields in gold. The image on the right uses the
Enhanced Vegetation Index to show plant density, with the panchromatic
band providing the value of the HSV transform:
The flexible
texture formats supported by the groundcover metadata directory, the
utility of the Earth RSE dependency manager and chunk selection
algorithm, and the growing power of consumer video cards mean that the
biggest challenge in pushing realtime analysis is the limit of
imagination rather than technology. All current consumer video
cards support the full set of elementary
operations,
and dependent lookups in the shader allow for fast searches in textures
that have been organized like binary trees. The shader used to
render the Enhanced Vegetation Index image above can even be extended
to classify any (small) number of substances, each with its own index
function. By associating hues, one for each substance, that are
equally spaced around the color wheel, multiple index functions may be
mapped against the same neutral background and yet be mutually
distinguishable.
Topography
Simplification
Topography is
simplified into a right-triangle irregular network by the
preprocessor. Heights of vertices in the triangulation are
quantized and serialized, along with a array of roughly one bit per
triangle. This bitarray gives the decoder sufficient context to
reconstruct vertex connectivity, build a triangle list, and interpolate
the unit vectors for geocentric referencing. A "maximum error"
metric is used to evaluate the error caused by removing each
vertex. The simplification code produces much higher quality
meshes than continuous LOD algorithms while only serializing the
vertices that actually are stored in the triangulation.
On the client
side, when each topography chunk is deserialized and decoded into
runtime geometry, each mesh is divided into small sections over which
the horizontal and vertical components of triangle surface areas are
summed. When the resource hierarchies are recursed each frame to
select data to render, these topographic surface areas are evaluated
with the camera position and direction to get an anisotropic projected
error. This special error value prevents the glut of data at the
horizon caused by the use of an isotropic error (error based only on
geometry's distance from the camera) as seen in MS Virtual Earth 3D and
NASA World Wind. Our anisotropic metric also provides more
consistent detail over the surface of the image, even to the edge of
the horizon, than that used by Google Earth. Consider the level
of detail of the notches in the hills in this image, both in the
camera-facing side as well as in the valley that cuts away from the
viewer on the left:
The Point System
Multiple
point-object datasets may also be recursed along with topography and
groundcover data during rendering. Earth RSE includes a
high-performance point system which processes lists of events into
spatial databases. Each event must minimally include latitude,
longitude, and a magnitude. Events are sorted by magnitude, which
may be something like population or earthquake moment, then processed
into a quadtree, using a space-filling curve (similar to the Z-order curve)
on the event coordinates, with each quadtree node containing a few
dozen events. The magnitude of an event in a node is greater than
the magnitudes of all the events in nodes below it. If a
point-object database has information that should be displayed in
response to a mouse over or other event, a second database table can be
built with one row per event, and keys stored in the primary database
nodes.
At runtime, points are factored into view-dependent local coordinate
systems. Event magnitudes are pushed through a view-dependent
dataset-defined function to produce bounding sphere radii. These
radii are projected into screen space and compared against an ascending
sequence of LOD pixel tolerances. After rendering of the
terrain has completed, events are grouped by dataset, local coordinate
system, and LOD, and sent to a batch render function. The
demonstration datasets supported by Earth RSE use vertex constant
geometry instancing to quickly render thousands of small meshes.
To demonstrate the point system, Earth RSE has support for a cities
database and the Global CMT database. MaxMind has made available
a CSV
file with 2.8 million cities, several tens of thousands with
population numbers. After filtering out rows with alternative
spellings, over 1.8 million cities remain. Global CMT is a
database of centroid moment solutions for all large shallow earthquakes
to hit since 1976. The distribution
currently has 29,542 events. Focal mechanisms are rendered as
four-section "beachballs." The orientation of the sections
correspond to the nodal planes of the solution. The beachballs
are colored according to the depth of the event, with shallow events in
orange and increasing in hue (in HSV) to deep events in red. A
mouse over event displays the complete CMT solution along with the
lower-hemisphere orthometric projection of the focal mechanism that
geologists most commonly work with. A rotation quaternion is
computed during preprocessing on the Global CMT event to orient the
focal mechanism in global space. At runtime, whenever an event is
moved to a new view-dependent local coordinate system, this quaternion
is concatenated with a second rotation to orient the mesh in the local
coordinate space.
Optimization
The entire
source base is structured so that all inner-loop floating point and
fixed point operations are evaluated using SSE/SSE2 packed
instructions. Data types have been deinterleaved into structures
of arrays to
allow for SIMD processing without shuffling or fetching unused
data. A very extensive SSE/SSE2 library built using SIMD
intrinsics, C++ templates, and operator overloading provides for very
optimized operations using a syntax similar to ordinary C.
Included in the library of specially optimized functions are three
different datum shifts; bicubic interpolation for float32 and
fixed-point int16, 49- and 64-tap Mitchell filters for producing
multi-resolution pyramids of topography and groundcover; geometric
functions to intersect primitives with volumes and surfaces; functions
to map queries to raster sections, points, and uv offsets; the function
that computes a covariance matrix for PCA; and the forward and backward
principal component transforms for image compression.
Why remote sensing and why now?
Consider the
five most conspicuous advances in consumer electronics in the past
decade:
- Digital photography -
Consumer digital cameras have driven the development of high
resolution, fast response, low noise photosensors.
- Wireless communication -
Everything from phones and PDAs to Wifi computer networks to all kinds
of IrDA and Bluetooth gadgets to a fleet of geostationary communication
satellites.
- Electronic storage - The
most miniature solid state devices to inexpensive 1TB 3.5" magnetic
disks. Density increases rapidly while power use drops.
- CPU speeds - As CPU clock
speeds are beginning to plateau, processors are keeping up with Moore's
Law by going multi-core. Dual-core x86 chips released in summer
of 2005, and quad-core x86 chips to be released summer of 2007.
Rate of core doublings is expected to accelerate into the future.
- Graphics processing -
Introduced in the 90s, consumer 3D graphics daughterboards simply
rendered texture-mapped triangles for games. Current generation
graphics processing units (GPUs) are full co-processors, and, thanks to
the standard shader
models,
support very flexible operations on a variety of texture formats,
including single-precision float. The highly parallel nature of
graphics processing has allowed GPU speeds to increase even faster than
CPU speeds.
A realtime 3D
remote sensing application built using the Earth RSE engine, while
not necessarily targeting the same broad audience as most of the
consumer electronics that drove these advances, has been enabled by
them.
Earth-observing
satellites (and here)
use sensors similar to those found in professional digital
cameras. Advances in camera sensors trickle up to the domain of
remote sensing, allowing higher resolution scenes with wider
coverage. Hyperspectral
satellites may capture several hundred spectral bands (e.g. EO-1
Hyperion captures 220 unique bands). While consumer cameras
pack only three
colors into sensor arrays, hyperspectral satellites distribute
hundreds of colors over just a few arrays. The resolution of
commercial camera sensors will soon hit a practical limit (who needs
50+ Megapixel cameras?), but because the effective resolution of a
hyperspectral camera is the sensor resolution divided by the number of
spectral bands, hyperspectral instruments can increase the number of
spectral bands to exploit sensor advances far into the future.
Sending the vast amounts of data collected by hyperspectral imaging
satellites is challenging. New transmission technologies in the
X-Band allow much more bandwidth from space to Earth. Even more
importantly, continuously improving processing power will allow
hyperspectral satellites to apply ever more aggressive signal decorrelation
and compression.
The Earth RSE demo demonstrates a high-performance principal
component transform on six channels of data. While this is the
optimal decorellating transform, its running time unfortunately scales
quadratically with the number of inputs. A hyperspectral dataset
with 220 bands has only 36 times as much data as the six-band Landsat
set of the same resolution, yet the same component transform would take
1344 times longer. On the other hand, this transform does yield
output that compresses with dramatically increasing performance as the
number of spectral bands increases. Faster processing allows
satellites to choose more optimal decorrelating transforms and
compression routines to send more bands over more wireless
bandwidth. The EO-1 satellite, launched in 2000, only has the
technology to send about twelve hyperspectral Hyperion scenes a day,
making significant coverage of the planet impossible. Satellites
launched at the end of this decade will have benefited from advances
largely driven by consumer electronics to capture and transmit imagery
of better quality over a much larger domain. And data storage
farms, developed to cache exabytes of internet miscellany, can easily
store the massive daily transfers from earth-observing satellites.
Fast graphics cards and multi-core CPUs allow rendering and analysis of
global multi-band datasets streamed over residential broadband.
Hyperspectral data may be factored and organized into spectral
hierarchies similar to the multi-resolution spatial hierarchies
navigated in 3D browsing. Consider the use of a continunous
function (possibly even piecewise), such as a probability
distribution, over the spectral domain of a hyperspectral
dataset. Three of these functions, each integrating to one (or to
the gain of the mapping), define a mapping from the reflectance spectra
of the groundcover to the RGB output of the computer screen. This
operation may be considered against the absorption
spectra of the photopigments
found in human cone
cells, which map from the visible spectrum to the psychological
sensations of seeing red, green, and blue. Realtime interactive
manipulation of the dataset mapping curves simulate changing the
response spectra of our own cone cells. For example, sliding the
red mapping curve into the near infrared spectrum allows the user a
visual experience foreign to humans but familiar to many animals
outside the class mammalia. By treating the dataset mapping as a
continuous function rather than a linear transformation matrix (with
one column per band and three rows for RGB output), the software can
adaptively stream, decompress, and transform the minimum number of
factored components of the groundcover spectral hierarchy which, when
blended, reconstruct the requested signal to within some
tolerance. The fact that there are only 256 intensities of red,
green, and blue output to a computer screen establishes a lower-bounds
error on rendering applications. Just as a small fraction of the
data in a multi-resolution pyramid need be accessed to accurately
render terrain from a single viewpoint, only a fraction of the
components in a spectral hierarchy contribute to blending hyperspectral
bands from a single mapping function. This hyperspectral
application is theoretically straight-forward but requires CPU power
and graphics fill-rate that has not until the present been available.
A slightly more sophisticated hyperspectral application is the use of spectral
angles or Pearsonian
Correlation to classify groundcover against known reflectance
spectra. Every substance has a reflectance spectrum, a unique
identifier that may serve as a quantum mechanical fingerprint.
Chemical libraries have been developed in which
substances are analyzed under spectrometer and their reflectance
spectra recorded. The reflectance spectrum for
coarse ground quartz can be matched against a hyperspectral dataset
and a correlation produced. If the correlation on a pixel is
high, that pixel is set to a color that represents coarse ground
quartz. This is a more sophisticated and quantitative use of
classification than the computation of vegetation indices supported in
the demo. Any number of substances can be classified and noted on
an image by selecting hues that are equally spaced around the color
wheel. Because the classification is automatic, this sort of
analysis is best performed (at least in part) on the CPU.
Many other valuable
analyses can be performed on hyperspectral or environmental data,
and with the commercially-driven advances in technology they should be
done in realtime in an interactive, 3D program powered by an engine
like Earth RSE. The engine
does not yet support hyperspectral datasets (as defined by having too
many spectral bands to fit in all the components in all the pixel
shader samplers), but it does provide a good platform for developing
spectral factoring and blending code, using the current multispectral
support as a model.
Datasets
seen in the demo:
NASA SRTM 1-arcsecond topography
- United States
CGIAR SRTM 3-arcsecond topography
- Global between 60° N and 56° S
USGS
GTOPO30 topography - Global 30-arcsecond
EGM96 reference
geoid - 15-arcminute grid
JPL Onearth Landsat 7 mosaic
- Global. Six color bands at 1-arcsecond, one panchromatic band
at half-arcsecond.
NASA
Blue Marble Next Generation mosaic - Global. Twelve monthly
mosaics for 2004 at 15-arcsecond.
MaxMind
Cities Database - 1.8 million cities CSV with country,
region/state, and population.
Global CMT -
29,000 earthquake CMT solutions. Jan 1976 - current.