Big Data and its Role in Driving Technology Trends – Intel’s Data Center CTO and Senior Fellow Steve Pawlowski Shares His Insights
How big is big data? Imagine a petabyte of data, which in video form represents about 10 years of HD viewing. Scientists, corporations and individuals are producing petabytes of data at an amazing rate: an average Chinese city’s traffic cameras produce a petabyte every 12 hours or so, and the CERN supercollider produces a petabyte per second. These massive stores of data represent an enormous opportunity for analytics, but scaling technology to process data of this size is an enormous challenge for the industry.

This is probably why Intel Senior Fellow Steve Pawlowski’s session at the Intel Developer Forum was a standing-room-only affair. While everyone talks about big data today, no one has the complete map of where we need to go to fully unlock its benefits. Steve broke the problem down into the areas where the industry needs to innovate to harness the full value of big data: the data center, compute and storage, data protection, and context and location.

To scale data center capability, the efficiency of data generation and compute at the data center level needs to be addressed. Steve discussed Rack Scale Architecture as well as data-center-level innovations, including ambient cooling, DC power and optimized PUE (power usage effectiveness), to help address this challenge. He also pointed to better management of infrastructure power through instrumentation, with technologies like Intel Data Center Manager, as a focus for Intel in driving the power dynamics across the data center. He then drilled into system efficiency, discussing platform technologies including improved fan speeds, high-efficiency power supplies and voltage regulators, and liquid cooling as ways to improve server-level efficiency, and noted that processor advancements can push this efficiency further.

Steve then changed focus to discuss the importance of efficient scaling of memory capacity for big data.
While today’s systems scale memory by adding DRAM, this scaling is expensive in both cost and power consumption. Steve pointed to new memory technologies on the horizon that would combine the performance of DRAM with the power and cost dynamics of NAND memory as an emerging alternative for the data center.

With a path defined towards memory innovation, Steve then moved on to the I/O and interconnect innovation required to move all of this data from collection to analysis. His first topic in this field was the breakthrough represented by silicon photonics, a new capability invented by Intel that uses silicon lasers to dramatically reduce the cost of optical transmission. He then discussed the growing challenge in wireless communication posed by an emerging spectrum shortage, stating that by 2020 we will be able to meet only 50% of the demand for spectrum. Steve pointed to software-defined radios, devices that can identify available spectrum and shift point-to-point communications onto it on demand, as a potential solution to the spectrum challenge, but admitted that regulatory control of spectrum must be addressed as well to put this solution into practice.

Steve followed his focus on moving data with the challenge of security, discussing how to ensure data integrity both within the data center and while data is in flight. While current solutions try to keep pace with today’s environments, the news is rife with stories of data exposure, often for nefarious purposes.
The industry needs to do more to ensure data integrity, and Steve called for a holistic approach that focuses on defending against firmware attacks, protecting privacy and data security, ensuring application security, overcoming programming errors, and developing applications designed for failure.

Finally, Steve focused on the analytics frameworks themselves, highlighting the early progress of Hadoop and pointing to the need for the industry to develop many algorithms for analytics. While the parallel batch work of frameworks like Hadoop will get us started, some of the most compelling uses of big data, such as genomics or social searching, involve graph approaches. Steve provided an example of investment by Intel’s research team in developing genomics algorithms that sped up results 10X, and stated that in the future compute will be driven by the requirements of these algorithms rather than algorithm development being forced towards a static compute platform.
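
To make the contrast between batch processing and graph approaches a little more concrete, here is a minimal sketch of the kind of question a social search asks: who is within two connections of a given person? It is an illustration only, not anything shown in the session. The `within_hops` function, the `friends` data and all of the names in it are hypothetical, and a real system would run this over a graph far too large for one machine's memory.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each node reachable from `start`
    in at most `max_hops` edges to its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # the starting person is not a search result
    return hops


# Hypothetical toy social graph, stored as an adjacency list.
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "dev"],
    "cho": ["ana", "dev", "eli"],
    "dev": ["ben", "cho", "fay"],
    "eli": ["cho"],
    "fay": ["dev"],
}

print(within_hops(friends, "ana", 2))
```

Each step of the traversal depends on the neighbors discovered in the previous step, so the work follows the links in the data rather than splitting into independent chunks. That is one reason graph problems are an awkward fit for a purely batch-oriented framework.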