Sunday, April 27, 2014

Move Over #BigData, Make Room for #DataSecurity


According to the U.S. Bureau of Labor Statistics, computer and mathematical occupations are expected to see job growth of 18% over the next 10 years. At the core of many businesses is data that requires organization, security, mining and analysis. As such, computing disciplines (information technology, computer science and business) include at least one database management systems course in their core curriculum. Every spring semester, I teach an advanced course on database design techniques and physical issues relating to enterprise-wide data management, using “Modern Database Management” by Hoffer et al. as the required textbook. The course has 3 main foci:
  • Modeling, e.g., entity-relationship diagram, enhanced entity-relationship diagram
  • Implementation, e.g., logical and physical database design, database querying 
  • Operating logistics, e.g., data stewardship, data and database administration
With #BigData, #DataAnalytics, #DataMining and #DataScience mentioned in nearly every computing-related article and conversation, students (and everyone else) want to know what these terms mean and how they affect businesses. The Computing Research Association (CRA) Big Data Whitepaper provides a great showcase of the #BigData challenges and opportunities. The figure below displays the data processing stages (top row) and the cross-cutting features that complicate every stage (bottom row).
CRA's Big Data Pipeline (http://www.cra.org/ccc/files/docs/init/bigdatawhitepaper.pdf)
#BigData discussions concentrate on the 3Vs (Volume+Variety+Velocity, circa 2001, META Group, now part of Gartner Inc.), 4Vs (3Vs+Viability/Veracity, circa 2012) or, now, 5Vs (4Vs+Value, circa late 2012/early 2013). So, the heterogeneity, scale, timeliness and human collaboration features from the above figure are covered, but where does that leave privacy? The data conversation must better integrate data security/privacy needs and challenges. The emergence of data-centric specializations and degree programs in data analytics, data mining and data science is fueled by the increasing need to prepare undergraduate and graduate students to handle real business data needs. To expose undergraduates to data and database security, an advanced database design course should augment the operating logistics topic with a database security overview, granular access control, securing database-to-database communications, and multi-level security in database systems.
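To give a flavor of what a multi-level security exercise might look like in such a course, here is a minimal Python sketch of a "no read up" check, in which a reader sees only the records labeled at or below their clearance. The level names, the `Record` fields and the sample data are all made up for illustration; a real database system would enforce this inside the engine rather than in application code.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical sensitivity levels, ordered from least to most sensitive."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


@dataclass(frozen=True)
class Record:
    """One stored record plus the sensitivity label attached to it."""
    key: str
    value: str
    label: Level


def readable_records(records, clearance):
    """Return only the records whose label does not exceed the reader's clearance."""
    return [record for record in records if record.label <= clearance]


# Made-up sample data, one record per level.
RECORDS = [
    Record("r1", "course catalog entry", Level.PUBLIC),
    Record("r2", "student grade", Level.CONFIDENTIAL),
    Record("r3", "research sponsor detail", Level.SECRET),
]

if __name__ == "__main__":
    for record in readable_records(RECORDS, Level.CONFIDENTIAL):
        print(record.key, record.label.name)
```

Using an ordered enumeration keeps the clearance comparison to a single `<=`, which is the whole idea behind the "no read up" rule in this simplified form.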

Graduate student awareness and training in data and database security/privacy must tightly couple education with applied research. Toward this effort, NSF has sponsored the Information Security Research and Education (INSuRE) Collaborative project. INSuRE aims to establish a long-lasting coalition between Centers of Academic Excellence in Information Assurance Research (CAE-R) and government in cybersecurity research. The initial partnership brings together four successful and mature CAE-Rs and the National Security Agency (NSA) to design, develop and test the research network. INSuRE will be a self-organizing, cooperative, multi-disciplinary, multi-institutional, and multi-level research collaborative that can work on both unclassified and classified research problems in the information security domain.

Other domains have data and database security/privacy considerations. Here are just a few:
  • Data Security in Transportation: digital cities & smart(@pietromax, @UCLALewisCenter), ride sharing (@uber, @lyft)
  • Data Security in Aviation: Timely Data Acquisition for the Aviation Industry
  • Data Security in Health: #HCLDR Twitter discussions Tues. @ 8:30PM, #BlerdChat & #HCHLITSS Twitter discussions Thurs. @ 7PM & 8PM, respectively
