SCUP 2016 Mid-Atlantic Symposium: Big Data: Academy Research, Facilities, and Infrastructure Implications and Opportunities. Johns Hopkins, May 13, 2016
Understanding the Big Data Enterprise
1. Understanding the Big Data
Enterprise
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
https://datascience.nih.gov/
philip.bourne@nih.gov
2. My Bias
• University professor - 30+ years
• Associate Vice Chancellor for Innovation – 2
years
• Maintainer of public data resources (PDB etc.
– 15 years)
• Open science advocate – 10+ years
• Fed – 2 years and counting
3. None of what I am about to tell you
negates what you have heard thus far
today…
Much of what you have heard is
prerequisite to my 30,000 foot view
4. My Definition of Big Data
• More than the 4+ “V’s”
• A signal of the coming digital economy
• An economy characterized by using data to
gain a business advantage (and yes
universities are a business)
5. What is the Worst that Can Happen?
[Figure: the six Ds plotted over time against volume, velocity, variety. Source: Steven Kotler]
• The six Ds: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization
• The Kodak example, in sequence:
– Digital camera invented by Kodak but shelved
– Megapixels & quality improve slowly; Kodak slow to react
– Film market collapses; Kodak goes bankrupt
– Phones replace cameras
– Instagram, Flickr become the value proposition
– Digital media becomes bona fide form of communication
http://bigthink.com/think-tank/steven-kotlers-six-ds-of-exponential-entrepreneurship
6. Enterprises that are not born digital
are at a disadvantage in this new
economy…
Fortunately no university has yet been
born digital …
The “Google university” could change
that
7. The Writing is on the Wall
(Personal Experiences)
• The story of Meredith
• Increasing number of undergraduates as first
authors on my papers
• Talking head lectures
• Growing frustration at lack of entrepreneurial
support
• The Google bus
8. The Writing is on the Wall
(Institutional)
• Changing access models
• Changing funding models
– Less federal and state funds
– More sponsored research
– Increased tuition
– More reliance on philanthropy
• Changing pedagogy
– MOOCs, SPOCs, DOCCs, flips
• Changing student expectations
– Expect to be taught in a different way
• Changing faculty expectations
– Expect more from the institution
• Changing staff expectations
– Better recognition
• Changing employer expectations
http://collegeparents.org/2011/01/26/when-your-college-student-unhappy/
Yet demand for a quality higher education has never been higher
9. Leads to the Notion of the University
as a Digital Enterprise
• The university is defined by its digital assets:
– On-line course materials
– All of the research life cycle on-line: grants, data,
computational methods, results, conclusions,
publications
– Faculty, staff and student profiles on-line
– All administrative data on-line e.g. grants, policies
and procedures, disclosures, contracts, patents,
agreements, payroll, academic files
10. The Most Successful Universities of the
Future Will be Those That Can Best
Leverage Their Digital Assets – How?
11. “Life Wasn’t Meant to be Easy”
Malcolm Fraser
Former Prime Minister of Australia
12. How? - Break Down the Silos
Research
Basic Clinical
Education Administration
13. How? - An Appropriate Organizational
Structure
Chancellor
CIO/CDO
Research Services
Education Services
Admin Services
Medical Services
Library
15. Research Data
• Prof. X drags and drops her research data to the
institutional dropbox. She is asked for a small
amount of metadata describing the dataset. Part
of that request gives permission for the data to
be indexed and the index analyzed by the
University. That analysis reveals that two other
researchers have worked on the same gene in the
past two months. All three are alerted to
their common interest and begin collaborating.
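The matching step in this scenario can be sketched in a few lines. This is only an illustration, not a description of any existing university system: the record fields (`researcher`, `gene`, `deposited`), the function name `find_shared_interests`, and the 60-day window standing in for "the past two months" are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def find_shared_interests(deposits, window_days=60):
    """Return {gene: [researchers]} for genes that two or more distinct
    researchers deposited data on within window_days of the latest deposit."""
    by_gene = defaultdict(list)
    for deposit in deposits:
        # Normalize the gene symbol so "tp53" and "TP53" index together.
        by_gene[deposit["gene"].upper()].append(deposit)

    alerts = {}
    for gene, items in by_gene.items():
        latest = max(d["deposited"] for d in items)
        cutoff = latest - timedelta(days=window_days)
        people = sorted(
            {d["researcher"] for d in items if d["deposited"] >= cutoff}
        )
        if len(people) >= 2:
            alerts[gene] = people
    return alerts


# Hypothetical metadata records, as might be collected at deposit time.
deposits = [
    {"researcher": "Prof. X", "gene": "TP53", "deposited": date(2016, 5, 10)},
    {"researcher": "Dr. Y", "gene": "tp53", "deposited": date(2016, 4, 2)},
    {"researcher": "Dr. Z", "gene": "TP53", "deposited": date(2016, 3, 20)},
    {"researcher": "Dr. W", "gene": "BRCA1", "deposited": date(2016, 5, 1)},
]

alerts = find_shared_interests(deposits)
print(alerts)
```

The point of the sketch is that the alert depends only on the small amount of metadata requested at deposit time; the datasets themselves never need to be opened.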
16. Faculty Productivity
• From a single profile a faculty member can, at
the push of a button, generate a world-facing
current web presence, provide biosketches to
the major funding agencies and submit their
academic file for review, saving countless
hours of reformatting that can now go into
productive research.
17. The Education – Research Interface
• The UCSD on-line drug commercialization
course, which previously had 40 local students,
now has 12,000, several of whom apply to Dr.
Bourne’s lab as PhD students based on the
material he presented. The course also
highlights UCSD’s leadership role, and after
navigating the on-line curriculum several
students apply to UCSD as undergraduates.
One high school student applies to Dr.
Bourne’s lab as a summer intern.
18. The Research-Administration Interface
• Researcher x receives a new grant. Researchers y
and z are notified, since it is very close to areas in
which they work and points of collaboration may
be possible.
• Researcher x needs to have an assay performed
and can immediately locate who on campus and
off-campus can perform the work and at what
cost.
• Experts on and off campus can immediately be
identified for the review of a potential patent
filing based on a researcher’s technology.
19. Talk is cheap – What is NIH doing to
address a similar situation?
20. NIH By Comparison
• 27 silos
• Clinical and basic research
• Intramural + extramural
• Administration
• Education role different
https://en.wikipedia.org/wiki/Victory_Soya_Mills_Silos
21. Established a Commons
• Supports a digital biomedical ecosystem
• Treats products of research – data, software, methods, papers
etc. as digital research objects
• Digital research objects exist in a shared virtual space
• Digital objects need to conform to FAIR principles:
– Findable
– Accessible (and usable)
– Interoperable
– Reusable
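As a rough illustration of what "conforming to FAIR" might mean for a single digital research object, the sketch below checks a metadata record for a few fields per principle. The field names and the mapping of fields to principles are assumptions chosen for the example; they are not the Commons specification or an official FAIR checklist.

```python
# Hypothetical minimum metadata per FAIR principle (an assumption for
# illustration, not an official checklist).
FAIR_FIELDS = {
    "findable": ["identifier", "title"],
    "accessible": ["access_url"],
    "interoperable": ["format"],
    "reusable": ["license"],
}


def fair_gaps(record):
    """Return {principle: [missing fields]} for a metadata record."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# A made-up record describing one digital research object.
record = {
    "identifier": "doi:10.0000/example",
    "title": "Example dataset",
    "access_url": "https://example.org/data/1",
    "format": "text/csv",
}

gaps = fair_gaps(record)
print(gaps)  # {'reusable': ['license']}
```

A real compliance check would go well beyond field presence (persistent identifier resolution, standard vocabularies, machine-readable licenses), but the shape of the test is the same: each object carries metadata that can be validated automatically.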
22. Commons Framework Pilots (CFPs)
• Exploring feasibility of the Commons framework
• Facilitating connectivity, interoperability and
access to digital objects
• Providing digital research objects to populate the
Commons
• Enable biomedical science to happen more easily
and robustly
23. Mapping Commons PILOTS to the Commons Framework
[Figure: Commons framework layers. Credit: Vivien Bonazzi]
• Compute Platform: Cloud or HPC
• Services: APIs, Containers, Indexing
• Software: Services & Tools – scientific analysis tools/workflows
• Data – “Reference” Data Sets, User defined data
• Digital Object Compliance
• App store/User Interface
• Service-model labels: IaaS, PaaS, SaaS
• Pilots mapped: BD2K Centers, MODS and HMP; BD2K Indexing (BioCADDIE, Other, schema.org)
24. Mapping Commons PILOTS to the Commons Framework
[Figure: Commons framework layers, as on the previous slide]
• Compute Platform: Cloud or HPC
• Services: APIs, Containers, Indexing
• Software: Services & Tools – scientific analysis tools/workflows
• Data – “Reference” Data Sets, User defined data
• Digital Object Compliance
• App store/User Interface
• Service-model labels: IaaS, PaaS, SaaS
• Pilot mapped: Cloud credits model (CCM)
25. Commons Credits Model
[Figure: Commons Credits Model. Credit: George Komatsoulis]
• Actors: NIH, Investigator, The Commons (infrastructure), Cloud Provider A, Cloud Provider B, Cloud Provider C, bioCADDIE
• Flows: Provides credits; Uses credits in the Commons; Indexes; Enables Search
• Option: Direct Funding
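One way to picture the credits flow is as a simple ledger: credits are granted to an investigator and can be spent only with providers that are part of the Commons. The sketch below is a toy model under that assumption; the class name `CreditLedger`, its methods, and the credit amounts are hypothetical and do not describe how NIH implements the model.

```python
class CreditLedger:
    """Toy ledger: credits are granted to investigators and spent
    only with providers registered in the Commons."""

    def __init__(self, providers):
        self.providers = set(providers)
        self.balances = {}

    def grant(self, investigator, credits):
        # The funder provides credits to an investigator.
        self.balances[investigator] = self.balances.get(investigator, 0) + credits

    def spend(self, investigator, provider, credits):
        # The investigator uses credits with a provider in the Commons.
        if provider not in self.providers:
            raise ValueError(f"{provider} is not a Commons provider")
        if credits > self.balances.get(investigator, 0):
            raise ValueError("insufficient credits")
        self.balances[investigator] -= credits
        return self.balances[investigator]


ledger = CreditLedger(["Cloud Provider A", "Cloud Provider B", "Cloud Provider C"])
ledger.grant("investigator-1", 100)
remaining = ledger.spend("investigator-1", "Cloud Provider B", 30)
print(remaining)  # 70
```

The design choice the sketch highlights is that the investigator, not the funder, picks the provider, so providers compete on the services they offer within the Commons.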
27. How to Change the Culture?
• Intramural and extramural training programs
• Fostering open science
– e.g. policies, challenges
• Fostering changes to the research life cycle
– e.g. preprints, data citation, open final reports
• Strategic planning with buy-in from major
stakeholders
• Use cases as exemplars
29. Some Thoughts as to Why I am Not
Crazy
• A platform to exchange goods – researchers
produce and consume reagents, data,
knowledge etc.
• A platform built on trust – trust is a key part of
the academic enterprise
• A platform provides a sustainable business
model
Sangeet Paul Choudary
http://www.wired.com/insights/2013/10/why-business-models-fail-pipes-vs-platforms/
30. Summary
It was the best of times, it was the worst of
times, it was the age of wisdom, it was the
age of foolishness, it was the epoch of
belief, it was the epoch of incredulity, it
was the season of Light, it was the season
of Darkness, it was the spring of hope, it
was the winter of despair…
Charles Dickens