Dr. Peter Bajcsy from NIST gave a talk entitled “To Measure or Not To Measure Terabyte-Sized Images” today at the UMD CS Department.

Abstract and Bio

This talk will elaborate on the basic question “To Measure or Not To Measure Terabyte-Sized Images?” that William Shakespeare might have posed had he been a bench scientist at NIST. This question poses a dilemma for many traditional scientists who operate imaging instruments capable of acquiring very large quantities of images. Manual analyses of terabyte-sized images, combined with insufficient software and computational hardware resources, prevent scientists from making new discoveries, increasing the statistical confidence of data-driven conclusions, and improving the reproducibility of reported results.
The motivation for our work comes from experimental systems for imaging and analyzing human pluripotent stem cell cultures at spatial and temporal coverages that lead to terabyte-sized image data. The objective of such an unprecedented cell study is to characterize specimens at high statistical significance in order to guide the repeatable growth of high-quality stem cell colonies. To pursue this objective, multiple computer and computational science problems have to be overcome, including image correction (flat-field, dark current, and background), stitching, segmentation, tracking, re-projection, feature extraction, data-driven modeling, and the representation of large images for interactive visualization and measurement in a web browser.
I will outline and demonstrate web-based solutions deployed at NIST that have enabled new insights in cell biology using TB-sized images. Interactive access to about 3TB of image and image feature data is available at https://isg.nist.gov/deepzoomweb/.

Bio: Peter Bajcsy received his Ph.D. in Electrical and Computer Engineering in 1997 from the University of Illinois at Urbana-Champaign (UIUC) and an M.S. in Electrical and Computer Engineering in 1994 from the University of Pennsylvania (UPENN). He worked for machine vision, government contracting, and research and educational institutions before joining the National Institute of Standards and Technology (NIST) in 2011. At NIST, he has been leading a project focusing on the application of computational science in biological metrology, specifically stem cell characterization at very large scales. Peter's area of research is large-scale image-based analyses and syntheses using mathematical, statistical, and computational models while leveraging computer science fields such as image processing, machine learning, computer vision, and pattern recognition. He has co-authored more than 27 journal papers, eight books or book chapters, and close to 100 conference papers.

Questions

Thank you, Dr. Bajcsy, for the great talk and live demo; it was very impressive. I have a question about rendering. I noticed that when the movie was recorded frame by frame, the images were not rendered in real time. Which factor limits the rendering rate? What is the greatest challenge in rendering terabyte-sized images?

Answer: Bandwidth!
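A rough back-of-envelope calculation (my own illustrative numbers, not the speaker's) shows why bandwidth dominates: even a modest 1366×768 RGB viewport refreshed at 30 frames per second would, uncompressed, require

1366 \times 768 \times 3\,\text{B} \times 30\,\text{s}^{-1} \approx 94\,\text{MB/s} \approx 755\,\text{Mbit/s}

which nearly saturates a 1 Gbit/s link before any deeper pyramid tiles are fetched. Compression and tiled rendering reduce this, but the network link remains the bottleneck.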

Messy Notes

Diabetes, heart disease, musculoskeletal disorders, and age-related macular degeneration are among today's health concerns.

To measure or not to measure the entire sample is a problem.

Underlying it are three fundamental problems: scale, complexity, and speed.

Scale: Imaging Tsunami

  • mXRF and XRD: 1 TB of data in 16 years
  • now: 1 TB of data in 3 min (the speed-up is worked out below)
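The speed-up between those two bullets works out to roughly

\frac{16\,\text{yr}}{3\,\text{min}} \approx \frac{16 \times 525{,}960\,\text{min}}{3\,\text{min}} \approx 2.8 \times 10^{6}

so the same terabyte arrives about three million times faster than it did a generation ago.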

Complexity: Image Models

Given 2 TB of acquired image data in 2 minutes.

Moving the 2 TB from the microscope to a computer takes about 267 min (4.4 h) over a 1 Gbit/s link.
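The 267-minute figure follows directly from the link speed:

t = \frac{2 \times 10^{12}\,\text{B} \times 8\,\text{bit/B}}{10^{9}\,\text{bit/s}} = 16{,}000\,\text{s} \approx 267\,\text{min}

so the data take roughly 130 times longer to move than the 2 minutes they took to acquire.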

Fundamental problem: transforming images into scientific insights.

  • Scale: nanometer-to-centimeter physical scale; TB- to PB-sized digital datasets
  • Complexity: many instruments, sample variety, and many models
  • Speed: validate models, explore, and discover

Potential Applications:

  • Astronomy
  • Chemistry
  • Medicine
  • Fire Research
  • Physics
  • Biology
  • Materials Science
  • Forensic Science

Case Studies

Age-Related Macular Degeneration (AMD): 11 million people affected in the US; the leading cause of vision loss in adults. The global cost is estimated at $343 billion, including $255 billion in direct health-care costs.

Stem cell engineering of the retina is needed…

Safety of Carbon Nanotubes.

54 laboratory animal studies.

Carbon nanotubes can cause adverse pulmonary effects, including inflammation and granulomas.

Challenges:

Scale: nanometer-to-centimeter physical scale; TB- to PB-sized digital datasets. Challenge: data spraying, fast processing, limited transfer.

Complexity: many instruments, sample variety, many models. Challenge: multi-modal image fusion, image object characterization and modeling.

Speed: validation of models, exploration, discovery. Challenge: comparison across models, rendering images, search over the image feature space.

  • Astronomy: Sloan Digital Sky Survey
  • Medicine: 2D histology slides
  • Earth Science: GIS visualization

Existing solutions are highly application-specific and impractical for new applications to adopt.

Current limits in Bio and Materials Sciences

  • Data reside on hard drives; large sample variety and many imaging modalities.
  • No trusted tools to collect measurements.

Approach: Scientist’s Perspective

Sample: one small field of view ≈ a megapixel image.

Create one large field of view per imaging modality ≈ a hundred-gigapixel image.
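Dividing the two coverage figures gives a rough tile count (a lower bound, since stitching requires overlapping tiles):

N \approx \frac{10^{11}\,\text{px}}{10^{6}\,\text{px/FoV}} = 10^{5}\ \text{small fields of view per modality}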

Fusion + rapid analyses <=> automated analyses

Metrology Perspective

Parameters of Data Acquisition and many small image fields of view

Calibration and image fusion

hierarchical partitioning

phenomenon models at each level, e.g., kinetic energy:

K = \frac{1}{2}mv^2

Parameterized models and algorithms

rapid and collaborative tools

parameterization, display, validation, and optimization

How to do Hierarchical Partitioning

A 5 mm sample spans ~115K pixels, which must be rendered on a 1366×768-pixel display.
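A minimal sketch of that level computation, assuming a Deep Zoom-style pyramid in which each level halves the resolution (the constants and helper below are my own illustration, not NIST's code):

```python
import math

FULL_WIDTH = 115_000    # pixels across the 5 mm sample (from the notes)
SCREEN_WIDTH = 1366     # target display width in pixels

# Number of 2x downsampling steps needed to fit the sample on screen.
levels = math.ceil(math.log2(FULL_WIDTH / SCREEN_WIDTH))
print(levels)  # 7

# Image width at each pyramid level, full resolution first.
widths = [max(1, FULL_WIDTH >> k) for k in range(levels + 1)]
print(widths)  # [115000, 57500, 28750, 14375, 7187, 3593, 1796, 898]
```

A viewer then fetches only the tiles of the level that matches the current zoom, which is what keeps a hundred-gigapixel image browsable in a web browser.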

Herbert Simon, winner of the 1978 Nobel Memorial Prize in Economics, said that

Complexity frequently takes the form of hierarchy.

Scale: logical partitioning, parallel algorithms, selective transfer.

Complexity: experimental design and fusion algorithms, object modeling.

Speed: model hierarchies, local processing, collaborative measurements.

Approach: Building Blocks

Transform many small fields of view (FoVs) into one large FoV.

Corrections, Stitching, Re-Projection, Segmentation, Tracking, Feature Extraction, Prediction Modeling
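As a concrete example of the Corrections step, here is a minimal flat-field/dark-current correction sketch. The formula (raw - dark) divided by the normalized (flat - dark) is the standard one behind the abstract's flat-field and dark-current correction; the function name, synthetic data, and epsilon guard are my own illustration, not NIST's implementation:

```python
import numpy as np

def flat_field_correct(raw, dark, flat):
    """Remove dark current and vignetting: (raw - dark) / normalized gain."""
    gain = flat.astype(np.float64) - dark   # per-pixel optical gain
    gain /= gain.mean()                     # normalize to keep intensity scale
    return (raw.astype(np.float64) - dark) / np.maximum(gain, 1e-6)

# Synthetic check: a uniform scene viewed through vignetting optics.
vignette = 0.5 + 0.5 * np.outer(np.hanning(512), np.hanning(512))
dark = np.full((512, 512), 100.0)           # constant dark-current offset
flat = dark + 1000.0 * vignette             # flat-field reference image
raw = dark + 500.0 * vignette               # uniform specimen, same optics
print(flat_field_correct(raw, dark, flat).std())  # ~0: vignetting removed
```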

Transform one large field of view for rapid analyses.

2.8 MB/image → 0.677 TB (17% sampled)
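For scale, dividing those two numbers (and reading "17% sampled" as 17% of the specimen area imaged, which is my interpretation):

\frac{0.677\,\text{TB}}{2.8\,\text{MB/image}} \approx 2.4 \times 10^{5}\ \text{images}, \qquad \frac{0.677\,\text{TB}}{0.17} \approx 4\,\text{TB at full coverage}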

Image understanding

Accuracy, Uncertainty, Robustness and Sensitivity

Scalability of image computations

The answer is to measure the entire sample.