T-SOUL

Vol.20 Artificial Intelligence Discovering Values of IoT Data -Analytics towards Deep Learning-


#03 Starting Verification at Smart Community Center
New Technical Approach of Parallel Distributed Learning to Practically Apply Deep Learning to IoT

Toshiya Takano
Chief Specialist, Deep Learning Technology Department, IoT Technology Center, Industrial ICT Solutions Company, Toshiba Corporation

Deep learning is used to analyze the very large volumes of data collected through the IoT (Internet of Things).
Using it effectively requires three things: creating learning models from data quickly and accurately, providing an analytical environment tailored to each customer's situation, and feeding the learning models back to the field. One problem with conventional parallel distributed learning technology is that calculation efficiency falls as more calculation nodes are added to handle larger data sets. Toshiba has started developing a high-speed parallel distributed learning technology that combines its experience in image and speech recognition with the high-speed storage technology of Dell EMC, a Dell Technologies Group company. A high-speed deep learning testbed based on this new technology has been approved by the Industrial Internet Consortium (IIC). Toshiba will use the testbed to verify deep learning applications for IoT at the Toshiba Smart Community Center.

Conventional Parallel Distributed Learning Technology Is Facing Its Limits

Deep learning achieves significantly higher accuracy than conventional analytical and inference techniques. It is expected to be useful for exploiting IoT data in the industrial sector, supporting quick judgments and yielding new knowledge in the control of manufacturing lines, the stable operation of social infrastructure, the optimization of building facility operations and other areas. Its high learning capability, however, places a significantly large calculation load on computers during the learning process. Using deep learning in the industrial sector becomes unrealistic if it takes a long time to create learning models, and to derive useful insights, from the diverse time-series data that sensors and cameras send continuously. For example, if detecting the factors behind a change and deciding on countermeasures from a very large amount of sensor data takes too long, the situation may have changed and the countermeasures may no longer be effective by the time the analysis is complete. As a result, time, money and confidence may be lost. For deep learning to be practical in the industrial sector, where large volumes of data are handled, learning models must be created quickly and accurately, analysis environments must be provided to suit each customer's needs and situation, and the results must be fed back to the field quickly.

The most commonly used technique to solve these problems is called "parallel distributed learning," which processes deep learning at high speed. Learning from a large volume of data on a single computer is very time consuming, so the task is carried out in parallel by multiple computers connected by Ethernet or other networks. Distributing the arithmetic load across the CPUs and GPUs of these computers shortens the overall processing time, so learning completes faster.
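The idea of splitting one learning task across several computers can be illustrated with a minimal sketch. It assumes a toy one-parameter model and simulates the "workers" inside a single process; the function names (shard, local_gradient, train) are hypothetical and are not taken from Toshiba's platform.

```python
# Toy synchronous data-parallel learning: fit y = w * x in plain Python.
# All names are hypothetical; no real cluster or GPU is involved.

def shard(data, num_workers):
    """Split the data set into num_workers interleaved shards."""
    return [data[i::num_workers] for i in range(num_workers)]

def local_gradient(w, samples):
    """Mean gradient of the squared error (w*x - y)^2 over one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)

def train(data, num_workers, steps=200, lr=0.01):
    """Each step: every worker computes a gradient on its own shard,
    the gradients are averaged, and the shared weight is updated once."""
    shards = shard(data, num_workers)
    w = 0.0
    for _ in range(steps):
        # On a real cluster these gradient computations would run in parallel.
        grads = [local_gradient(w, s) for s in shards]
        # Synchronization point: combine the workers' results.
        w -= lr * sum(grads) / len(grads)
    return w

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # true weight is 3.0
w_single = train(data, num_workers=1)
w_parallel = train(data, num_workers=4)
print(w_single, w_parallel)
```

Because the shards are of equal size here, averaging the workers' gradients gives the same update as one computer processing all the data, while each worker only touches a quarter of it per step.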

Under the theme of "Application of Deep Learning to IoT," Toshiba has been developing related technologies, running a variety of simulations and accumulating experience through trial and error. In the process, Toshiba has identified bottlenecks in conventional parallel distributed learning platforms.


Mere Connection of Computers
Does Not Increase the Learning Speed of Deep Learning

Figure 1 shows a performance simulation of deep learning by conventional parallel distributed learning, processing one year of time-series sensor data of more than 30,000 kinds stored at the Toshiba Smart Community Center (LAZONA Kawasaki Toshiba Building in Kanagawa, Japan). The vertical axis indicates calculation performance relative to that of a single node, and the horizontal axis indicates the number of calculation nodes (each with four GPUs) used for processing. The figure shows that processing speed did not increase in linear proportion to the number of calculation nodes but reached a ceiling at some point. When the computers are connected by 40Gb Ethernet, the change in performance is moderate, whereas with 1Gb Ethernet performance peaks at only two nodes. This means that conventional parallel distributed learning is ill-suited to running an efficient and effective cycle of creating, operating and feeding back learning models when deep learning is used in the industrial sector, which handles large volumes of IoT data.

Fig. 1 Transferring a learning model via Ethernet by conventional parallel distributed learning technology

In parallel distributed learning, the learning models trained on individual computers are integrated and updated, so data is exchanged frequently between the parameter server and the other computers. As more calculation nodes are added to cope with a larger volume of data, the volume of data exchanged among the nodes also grows, and the efficiency of the entire system drops because network capacity is limited. Viewing this network overhead as a major factor in reduced calculation efficiency, Toshiba started searching for a new system that would remove this bottleneck in parallel distributed learning.
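Why network overhead caps the speed can be seen in a simple back-of-the-envelope cost model. The sketch below is an assumption made for illustration, not Toshiba's simulation: it supposes that the computation per step divides evenly among the nodes, while the time to exchange models over one shared link grows with the number of nodes. The parameter values are arbitrary and are not intended to reproduce Fig. 1.

```python
# Illustrative cost model (an assumption, not the simulation behind Fig. 1):
# compute time shrinks with more nodes, exchange time over a shared link grows.

def step_time(nodes, compute_s, model_gbit, link_gbps):
    """Seconds per learning step for a given number of nodes."""
    compute = compute_s / nodes              # work is divided among the nodes
    if nodes == 1:
        return compute                       # a single node exchanges nothing
    exchange = nodes * model_gbit / link_gbps  # every node sends its model
    return compute + exchange

def speedup(nodes, compute_s, model_gbit, link_gbps):
    """Performance relative to one node (1.0 means no gain)."""
    one_node = step_time(1, compute_s, model_gbit, link_gbps)
    return one_node / step_time(nodes, compute_s, model_gbit, link_gbps)

def best_node_count(max_nodes, compute_s, model_gbit, link_gbps):
    """Node count with the highest speedup, searched up to max_nodes."""
    return max(range(1, max_nodes + 1),
               key=lambda n: speedup(n, compute_s, model_gbit, link_gbps))

# Arbitrary example values: 10 s of compute per step and a 1 Gbit model.
best_slow = best_node_count(16, compute_s=10.0, model_gbit=1.0, link_gbps=1.0)
best_fast = best_node_count(16, compute_s=10.0, model_gbit=1.0, link_gbps=40.0)
print(best_slow, best_fast)
```

In this model the slower link reaches its best node count much earlier than the faster one, which matches the qualitative behavior described above: beyond a certain point, adding nodes only adds exchange time.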


Developed Platform for High-Speed Parallel Distributed Learning
Using All Flash Storage as Infrastructure


Analyzing the large volume of diverse time-series data collected from the IoT efficiently with deep learning depends on removing the bottlenecks of conventional parallel distributed learning, so that learning speed rises in proportion to the number of calculation nodes. Toshiba focused on high-speed all-flash storage, a product of Dell EMC in the Dell Technologies Group. This storage offers large capacity by integrating NAND flash memory at ultrahigh density, and high-speed access through a wide-bandwidth backplane of up to 100GB/s, and it is connected to multiple computers by dedicated lines. It provides the high speed and reliability needed to support mission-critical systems for enterprises and social infrastructure.

Instead of connecting the computers by Ethernet, Toshiba adopts this storage as shared storage for the learning models exchanged among the computers (Fig. 2). Using the storage's high-speed access, Toshiba aimed to create a platform for high-speed, high-efficiency parallel distributed learning with small overheads for synchronizing and updating learning models.
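The pattern of exchanging learning models through shared storage rather than node-to-node messages can be sketched as follows. This is only an illustration under stated assumptions: a temporary local directory stands in for the shared storage, models are plain lists of numbers stored as JSON, and the helpers publish and merge are hypothetical names, not part of any Toshiba or Dell EMC interface.

```python
# Sketch of model exchange through shared storage (assumption: a local
# temporary directory stands in for the high-speed shared storage).
import json
import os
import tempfile

def publish(shared_dir, worker_id, weights):
    """Write one worker's model so that readers never see a partial file."""
    final_path = os.path.join(shared_dir, f"worker_{worker_id}.json")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(weights, f)
    os.replace(tmp_path, final_path)  # atomic rename on the same filesystem

def merge(shared_dir):
    """Read every published model and average the weights element-wise."""
    models = []
    for name in sorted(os.listdir(shared_dir)):
        if name.endswith(".json"):
            with open(os.path.join(shared_dir, name)) as f:
                models.append(json.load(f))
    return [sum(values) / len(models) for values in zip(*models)]

with tempfile.TemporaryDirectory() as shared_dir:
    publish(shared_dir, 0, [1.0, 2.0])   # model from a first worker
    publish(shared_dir, 1, [3.0, 4.0])   # model from a second worker
    merged = merge(shared_dir)

print(merged)
```

In such a design each worker needs only a fast path to the storage, so the cost of synchronization depends on storage bandwidth rather than on the network links between the computers.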

Fig. 2 New parallel distributed learning technology that shares learning models in high-speed storage

Fig. 2 illustrates the simulated relationship between the number of calculation nodes and processing speed when the high-speed storage is used as infrastructure, with data on the same scale as that of the Toshiba Smart Community Center. Processing speed increases linearly without saturation up to nine nodes, and performance degrades by only about 10% even with ten nodes. This result encouraged Toshiba to believe that combining its deep learning technology with parallel distributed learning on Dell EMC's high-speed storage can fully handle deep learning on large-scale data and feedback to the field, including in the industrial sector.


Approved as IIC Testbed,
Verification Started at Smart Community Center

To investigate and verify high-speed, high-efficiency parallel distributed learning for large volumes of IoT data, Toshiba proposed a deep learning testbed, the "Deep Learning Facility," jointly developed with Dell EMC, to the Industrial Internet Consortium (IIC). In October 2016 it became the first deep learning platform to be approved by the IIC, an international group that promotes de facto standardization of IoT utilization in the industrial sector.

After receiving this approval, Dell EMC and Toshiba started verification of the "Deep Learning Facility" at Toshiba's Smart Community Center. Tens of thousands of kinds of time-series data, such as room environment data, electric power consumption and records of people entering and exiting the buildings, are collected through the sensors of building management systems, air conditioning equipment and security gates. Learning models are created quickly through parallel distributed learning that uses the high-speed storage. Toshiba is verifying the usefulness of deep learning using its IoT platform by optimizing the maintenance of sensors, monitoring equipment and other devices, and by improving the facility operation rate through quick feedback of failure symptoms to the field.

This verification is planned to continue until September 2017. Dell EMC and Toshiba expect the new parallel distributed learning technology to yield best practices for deep learning applications for IoT, with a view to deploying it as a service. Toshiba believes that the applications of deep learning will expand in the industrial sector as solutions that enable efficient management and control of building facilities, manufacturing sites and social infrastructure.

* The corporate names, organization names, job titles and other names and titles appearing in this article are those as of February 2017.
