My former collaborator, Lee Stearns, presented his dissertation talk today.
HandSight: A Touch-Based Wearable System to Increase Information Accessibility for People with Visual Impairments
See http://www.leestearns.com for more details. A partial outline of the talk follows:
Related Work
- OrCam, Access Lens, OmniTouch, VizLens (UIST 2016), ForeSee, Google Glass, eSight, NuEyes, IrisVision
Reading / Exploring Text
- Advantages of Touch-Based Reading
- No need to frame the document for an overhead camera
- Allows direct access to spatial information
- Provides better control over pace and rereading
- New Challenges
- How to precisely trace a line of text? (see the sketch after this list)
- How to support physical navigation?
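On the line-tracing challenge: the HandSight papers explore continuous audio and haptic cues that steer the finger back onto the current line. Below is a minimal sketch of the audio variant, assuming a tone whose pitch rises or falls with vertical drift; the names and constants are mine, not HandSight's.

```python
import math

# Toy model of pitch-based line guidance: the finger's vertical drift
# from the current text baseline maps to a tone frequency, one octave
# up/down at full drift. Constants are illustrative only.
BASE_FREQ_HZ = 440.0   # tone when the finger is centered on the line
MAX_DRIFT_PX = 40.0    # drift (in image pixels) at which the cue saturates

def guidance_pitch(finger_y: float, baseline_y: float) -> float:
    """Higher tone when the finger drifts above the line, lower when below."""
    drift = (baseline_y - finger_y) / MAX_DRIFT_PX  # image y grows downward
    drift = max(-1.0, min(1.0, drift))              # clamp to [-1, 1]
    return BASE_FREQ_HZ * math.pow(2.0, drift)

if __name__ == "__main__":
    for y in (160, 180, 200, 220, 240):             # baseline at y = 200
        print(y, round(guidance_pitch(y, 200.0), 1))
```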
HandSight HoloLens version
- Augment rather than replace
- Camera resolution too low
- Turning head to look at desired content was uncomfortable
- Voice commands are cumbersome and imprecise
- Fixed 2D: screen-anchored billboards
- Fixed 3D: vertical and horizontal modes
- Finger tracking design
- Follows where the finger is pointing in 3D
- Three participants
Findings
- Finger-worn camera
- [+] Flexible, allows hands-free use
- [-] Requires moving finger to read
- HoloLens for low-vision users
- Low contrast due to transparency
- Narrow field of view, limited to the center of vision
Handheld Camera / Mobile Phone
- 6 low vision participants
- Participants were more successful and more positive about their experience
- [+] Better camera
- [+] More usable interactions
- [-] No longer hands-free
Conclusion / Strengths and Weaknesses of 3D AR
- [+] Enables new interactions not possible with other approaches
- [+] Good for multitasking
Design space exploration: AR magnification & enhancement
Implementation and evaluation
On-body Input using finger-worn sensors
- Preprocessing
- Coarse-Grained Classification
- Textures
- Feature Extraction (accelerometer, gyroscope, magnetometer)
- Localization (SVM; see the sketch below)
- Fine-Grained Classification
- Geometric Verification
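A rough scikit-learn sketch of what the coarse-grained localization step could look like: an SVM over concatenated texture and IMU features. The feature layout, dimensions, and data are synthetic stand-ins, not the dissertation's actual pipeline.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_features = 200, 32                # e.g., texture + IMU descriptors
X = rng.normal(size=(n_samples, n_features))   # placeholder feature vectors
y = rng.integers(0, 4, size=n_samples)         # placeholder body-location labels

# Standardize features, then classify body location with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
clf.fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```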
Within-Person Classification Experiment
- Coarse-grained: 99.1%, 88.0%, 96.4%
Interface Designs
- Real-time processing (~60 fps)
- Five applications (clock, health & activities, daily summary, notifications) with location-specific gestures on the body (see the sketch after this list)
- 12 visually impaired participants
- 5 participants preferred location-independent gestures
- 6 participants preferred location-specific gestures on the palm
- 1 participant preferred location-specific gestures on the body
- Mitigates camera framing issues
- Demonstrates feasibility of the approach with high accuracy
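As a toy illustration of location-specific gestures, here is a dispatch from recognized (body location, gesture) pairs to application actions; the mappings below are invented for illustration and are not the study's.

```python
# Hypothetical (body location, gesture) -> action table; the real
# HandSight gesture vocabulary and mappings differ.
ACTIONS = {
    ("palm", "swipe_right"): "next notification",
    ("palm", "double_tap"): "read daily summary",
    ("wrist", "double_tap"): "speak current time",
    ("thigh", "swipe_up"): "open health & activities",
}

def dispatch(location: str, gesture: str) -> str:
    """Look up the action for a recognized gesture at a body location."""
    return ACTIONS.get((location, gesture), "unrecognized gesture")

print(dispatch("wrist", "double_tap"))   # -> speak current time
```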
Color Recognition
- Limitations: cannot recognize patterns, only color
- Do not allow users to quickly inspect multiple locations
- Accuracy affected by ambient lighting and distance.
- Both colors and visual patterns
- Deep convolutional activation features (DeCAF)
- Dense SIFT features combined in an Improved Fisher Vector (IFV)
- Highly controlled dataset – risks overfitting, limits robustness
- Solid, striped, checkered, dotted, zigzag, textured; rotations (30° increments), scales (1-4)
- An End-to-End Deep Learning Approach
- Fine-tuning the classifier with ~half of the HandSight images (N=36 per class) increases the accuracy to 96.5% (see the sketch below)
- Identify multiple colors in a single image
- User-configurable level of detail
- Two datasets of fabric pattern images
- 529 images from HandSight
- 77,052 external images
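The talk reports only the fine-tuning result (96.5% with ~half of the HandSight images), so the following PyTorch sketch is a guess at the setup: an ImageNet-pretrained backbone with a new 6-way head for the pattern classes. The backbone, optimizer, and hyperparameters are all assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6   # solid, striped, checkered, dotted, zigzag, textured

# Start from an ImageNet-pretrained ResNet-18 and replace the classifier head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One gradient step on a batch of (N, 3, 224, 224) images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Smoke test with random tensors standing in for HandSight pattern images.
batch = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (4,))
print("loss:", fine_tune_step(batch, labels))
```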
Future Work
- Alternative or supplementary camera locations
- Camera on the user’s finger or wrist
- Camera on the user's upper body
- Wider field of view provides more contextual information
- Easier to localize and track hand/finger position
- Spatial exploration of documents and other surfaces
- Maps, charts, and graphs are hard to explain (an interesting deep learning topic)
- Translating to other languages