Preferred Networks’ MN-3 Tops Green500 List of World’s Most Energy-Efficient Supercomputers
https://www.preferred.jp/en/news/pr20200623/

TOKYO – June 23, 2020 – Preferred Networks, Inc. (PFN) and Kobe University announced today that MN-3, PFN’s deep learning supercomputer, topped the latest Green500 list of the world’s most energy-efficient supercomputers. MN-3 is powered by MN-Core™, a highly efficient custom processor co-developed by PFN and Kobe University specifically for use in deep learning.

PFN’s MN-3 deep learning supercomputer

MN-3 achieved an energy efficiency of 21.11 gigaflops per watt (Gflops/W) on the industry-standard High-Performance Linpack (HPL) benchmark, meaning it performed 21.11 billion floating-point operations per second for every watt of power consumed (a quick consistency check appears after the list below). This is 15% higher than the previous Green500 record of 18.404 Gflops/W, set in June 2018, and demonstrates that MN-Core and MN-3 are leading the global competition for energy-efficient supercomputers specialized for deep learning.

MN-3, which is located in the Simulator Building of the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) at its Yokohama Institute for Earth Sciences, started operation in May 2020. The system used for the performance measurement consisted of 40 nodes with 160 MN-Core processors.

  • Theoretical peak performance: 3.92 Pflops
  • Sustained HPL benchmark performance (solving a dense system of linear equations): 1.62 Pflops
  • Energy efficiency (performance per watt of power consumed): 21.11 Gflops/W
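
As a back-of-the-envelope illustration (a sketch in Python, not part of the official measurement), the HPL result and the efficiency figure together imply the power drawn by the 40-node system during the run:

    # Implied power draw of the Green500 run, derived from the figures above.
    hpl_flops = 1.62e15      # sustained HPL performance: 1.62 Pflops
    efficiency = 21.11e9     # energy efficiency: 21.11 Gflops/W
    power_watts = hpl_flops / efficiency
    print(f"Implied power draw: {power_watts / 1e3:.1f} kW")       # ~76.7 kW
    print(f"Per node (40 nodes): {power_watts / 40 / 1e3:.2f} kW")  # ~1.92 kW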

https://www.top500.org/system/179806/

Note: The TOP500 entry states that MN-3 has 2,080 cores. This number consists of 160 MN-Core processors, counted as one core each, and 1,920 Intel Xeon CPU cores. MN-Core performs most of the computation in the HPL benchmark measurement.

The key elements that contributed to the achievement are as follows.

1. MN-Core
MN-Core, developed by PFN and Kobe University with support from RIKEN AICS/R-CCS, is equipped with highly efficient compute units designed specifically for deep learning.

MN-Core board

2. MN-Core DirectConnect
The MN-Core DirectConnect interconnect facilitates high-speed, high-efficiency data transmission between the nodes.

3. Optimization techniques for efficient workload management
The performance optimizations used in the HPL Benchmark can also be used to speed up deep learning workloads.

4. High compute-density and locality
The energy efficiency was maximized by densely integrating multiple MN-Core dies onto each board.

These technologies, which drastically reduce environmental impact and operating costs, are expected to become a foundation for next-generation supercomputers as well as for highly efficient information systems in general.

PFN plans to further increase MN-3’s energy efficiency by improving the installation methods, cooling and MN-Core-specific middleware.

For more information about PFN’s supercomputers, visit: https://projects.preferred.jp/supercomputers/en/

Preferred Networks builds MN-2, a state-of-the-art supercomputer powered with NVIDIA GPUs
https://www.preferred.jp/en/news/pr20190318/

MN-2 will become operational in July 2019, bringing PFN’s combined computing power to about 200*1 PetaFLOPS*2.

 

March 18, 2019, Tokyo Japan – Preferred Networks, Inc. (PFN, Head Office: Tokyo, President & CEO: Toru Nishikawa) will independently build a new private supercomputer called MN-2 and start operating it in July 2019.

MN-2 is a cutting-edge multi-node GPGPU*3 computing platform, using NVIDIA(R) V100 Tensor Core GPUs. This, combined with two other PFN private supercomputers ― MN-1 (in operation since September 2017) and MN-1b (in operation since July 2018), will provide PFN with total computing resources of about 200 PetaFLOPS. PFN also plans to start operating MN-3, a private supercomputer with PFN’s proprietary deep learning processor MN-Core(TM), in spring 2020.

By continuing to invest in computing resources, PFN will further accelerate practical applications of research and development in deep learning technologies and establish a competitive edge in the global development race.

Conceptual image of the completed MN-2

 

Outline of PFN’s next-generation private supercomputer MN-2

MN-2 is PFN’s private supercomputer equipped with 5,760 of the latest CPU cores as well as 1,024 NVIDIA V100 Tensor Core GPUs, and it will be fully operational in July 2019. It is being built on the premises of the Yokohama Institute for Earth Sciences, Japan Agency for Marine-Earth Science and Technology. MN-2 will not only work alongside MN-3, which is scheduled to start operation in 2020 at the same site, but will also connect over a closed network with its predecessors MN-1 and MN-1b, which are currently up and running. MN-2 has a theoretical peak of about 128 PetaFLOPS in the mixed-precision calculations used in deep learning (a rough sketch of this figure follows below), which means MN-2 alone has more than double the peak performance of MN-1b.
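
As a rough sketch of where the 128 PetaFLOPS figure comes from (assuming the publicly listed ~125 TFLOPS Tensor Core peak per V100, a number taken from NVIDIA’s specifications rather than from this release):

    # Approximate mixed-precision peak of MN-2 from the GPU count alone.
    # The per-GPU Tensor Core peak is an assumption (NVIDIA V100 public spec).
    num_gpus = 1024
    tensor_core_peak_tflops = 125
    print(f"~{num_gpus * tensor_core_peak_tflops / 1e3:.0f} PetaFLOPS")  # ~128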

Each MN-2 node has four 100-gigabit Ethernet links and uses RoCEv2*4 to interconnect with the other GPU nodes. This specifically tuned interconnect enables high-speed multi-node processing. In parallel, PFN is building its own software-defined storage*5 with a total capacity of over 10 PB and optimizing data access patterns for machine learning to speed up training.

PFN will fully utilize the open-source deep learning framework Chainer(TM) on MN-2 to further accelerate research and development in fields that require a large amount of computing resources such as personal robots, transportation systems, manufacturing, bio/healthcare, sports, and creative industries.

Comments from Takuya Akiba,
Corporate Officer, VP of Systems, Preferred Networks, Inc.

“We have been utilizing large-scale data centers with the state-of-the-art NVIDIA GPUs to do research and development on deep learning technology and its applications. High computational power is one of the major pillars of deep learning R&D. We are confident that the MN-2 with 1,024 NVIDIA V100s will further accelerate our R&D.”

 

Comments from Masataka Osaki,
Japan Country Manager, Vice President of Corporate Sales, NVIDIA

“NVIDIA is truly honored that Preferred Networks has chosen NVIDIA V100 for the MN-2, in addition to the currently operating MN-1 and MN-1b, also powered with our cutting-edge GPUs for data centers. We anticipate that the MN-2, accelerated by NVIDIA’s flagship product with high-speed GPU interconnect NVLink, will spur R&D of deep learning technologies and produce world-leading solutions.”

*1: The figure for MN-1 is total PetaFLOPS in half precision. For MN-1b and MN-2, the figures are PetaFLOPS in mixed precision, i.e., the combined use of more than one floating-point precision format.

*2: PetaFLOPS is a unit measuring computer performance. Peta is 1,000 trillion (10 to the power of 15) and FLOPS is used to count floating-point operations per second. Therefore, 1 PetaFLOPS means that a computer is capable of performing 1,000 trillion floating-point calculations per second.

*3: General-purpose computing on GPU

*4: RDMA over Converged Ethernet. RoCEv2 is a network protocol for remote direct memory access (RDMA) between nodes, used to achieve low latency and high throughput over Ethernet.

*5: Software-defined storage is a storage system in which software centrally controls distributed storage devices to increase their utilization.

 

*MN-Core(TM) and Chainer(TM) are the trademarks or registered trademarks of Preferred Networks, Inc. in Japan and elsewhere.

Preferred Networks develops a custom deep learning processor MN-Core for use in MN-3, a new large-scale cluster, in spring 2020
https://www.preferred.jp/en/news/pr20181212/

Dec. 12, 2018, Tokyo Japan – Preferred Networks, Inc. (“PFN”, Head Office: Tokyo, President & CEO: Toru Nishikawa) announces that it is developing MN-Core(TM), a processor dedicated to deep learning, and will exhibit this independently developed deep learning hardware, including the MN-Core chip, board, and server, at SEMICON Japan 2018, held at Tokyo Big Sight.


With the aim of applying deep learning in the real world, PFN has developed the open-source deep learning framework Chainer(TM) and built the powerful GPU clusters MN-1 and MN-1b, which support its research and development activities. By using these clusters together with innovative software to conduct large-scale distributed deep learning, PFN is accelerating R&D in areas such as autonomous driving, intelligent robots, and cancer diagnosis, and is stepping up efforts to put these results to practical use.

To speed up the training phase of deep learning, PFN is currently developing the MN-Core chip, which is dedicated to and optimized for the matrix operations that are characteristic of deep learning. MN-Core is expected to achieve a world-class performance per watt of 1 TFLOPS/W (half precision). Today, floating-point operations per second per watt is one of the most important benchmarks when developing a chip. By focusing on a minimal set of functions, the dedicated chip can boost effective performance in deep learning while also bringing down costs (a brief consistency check of the per-watt figures follows the specification list below).

  • Specifications of the MN-Core chip
    • Fabrication process: TSMC 12 nm
    • Estimated power consumption: 500 W
    • Peak performance: 32.8 TFLOPS (DP) / 131 TFLOPS (SP) / 524 TFLOPS (HP)
    • Estimated performance per watt: 0.066 TFLOPS/W (DP) / 0.26 TFLOPS/W (SP) / 1.0 TFLOPS/W (HP)

(Notes) DP: double precision, SP: single precision, HP: half precision
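
The per-watt estimates follow directly from the peak figures and the 500 W power estimate; a short sketch that reproduces them:

    # Reproduce the estimated TFLOPS/W figures from the specification above.
    power_w = 500
    peak_tflops = {"DP": 32.8, "SP": 131, "HP": 524}
    for precision, tflops in peak_tflops.items():
        print(f"{precision}: {tflops / power_w:.3f} TFLOPS/W")
    # Output: DP: 0.066, SP: 0.262, HP: 1.048 -- matching the estimates above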

https://projects.preferred.jp/mn-core/en/

 

Further improvement in the accuracy and computation speed of trained deep learning models is an essential prerequisite for PFN to tackle more complex problems that remain unsolved, so it is important to keep increasing computing resources and making them more efficient. PFN plans to build MN-3, a new large-scale cluster loaded with MN-Core chips, and to begin operating it in the spring of 2020. MN-3 will comprise more than 1,000 dedicated server nodes, and PFN intends to eventually raise its computation speed to a target of 2 EFLOPS.

For MN-3 and subsequent clusters, PFN aims to build more efficient computing environments by making use of MN-Core and GPGPU (general-purpose computing on GPU) according to their respective fields of specialty.   

Furthermore, PFN will advance the development of the Chainer deep learning framework so that MN-Core can be selected as a backend, thus utilizing both software and hardware approaches to drive innovations based on deep learning.

 

PFN’s self-developed hardware for deep learning, including MN-Core, will be showcased at its exhibition booth at the SEMICON Japan 2018.

  • PFN exhibition booth at SEMICON Japan 2018
    • Dates/Time: 10:00 to 17:00 Dec. 12 – 14, 2018
    • Venue: Booth #3538, Smart Applications Zone, East Hall 3 at Tokyo Big Sight
    • Exhibits:
      (1)  Deep Learning Processor MN-Core, Board, Server
      (2) Preferred Networks Visual Inspection
      (3) Preferred Networks plug&pick robot

 

*MN-Core (TM) and Chainer (TM) are the trademarks or the registered trademarks of Preferred Networks, Inc. in Japan and elsewhere.

Preferred Networks wins second place in the Google AI Open Images – Object Detection Track, competed with 454 teams
https://www.preferred.jp/en/news/pr20180907/

Sept. 7, 2018, Tokyo Japan – Preferred Networks, Inc. (PFN, Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa) participated in the Google AI Open Images – Object Detection Track, an object detection challenge hosted by Kaggle*1, and won second place in the competition among 454 teams from around the world.

 

Object detection, one of the major research subjects in computer vision, is a fundamental technology critical for autonomous driving and robotics. Challenges built around large-scale datasets such as ImageNet and MS COCO, which push for better detection accuracy, have served as a unifying force for the research community and contributed to the rapid improvement of detection techniques and algorithms.

 

The Google AI Open Images – Object Detection Track, held between July 3 and August 30, 2018, was a competition of unprecedented scale based on Open Images V4*2, a large and complex dataset released by Google this year. As a result, the event attracted the attention of many researchers, and a total of 454 teams from around the world participated.

PFN entered the competition as team “PFDet”, composed mainly of developers of ChainerMN and ChainerCV (PFN’s distributed deep learning library and its deep-learning-based computer vision library, respectively) as well as specialists in autonomous driving and robotics. During the competition, PFN’s large-scale cluster MN-1b, which has 512 NVIDIA(R) Tesla(R) V100 32GB GPUs, was in full operation for the first time since its launch in July this year. The team used parallel deep learning techniques to speed up training on the large-scale dataset (a minimal example of such a setup is sketched below) and made full use of the research results PFN has accumulated over the years in autonomous driving and robotics. These efforts resulted in the team finishing second by a narrow margin of 0.023% behind the first-place team.
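
The release does not detail the team’s training code; the following is a minimal, hypothetical sketch of how data-parallel training is typically set up with ChainerMN (the tiny stand-in model and the MNIST dataset below are placeholders, not the team’s actual detector or data):

    # Minimal ChainerMN data-parallel setup (illustrative sketch only).
    import chainer
    import chainermn

    # One MPI process per GPU; 'pure_nccl' uses NCCL for gradient all-reduce.
    comm = chainermn.create_communicator('pure_nccl')
    device = comm.intra_rank
    chainer.cuda.get_device_from_id(device).use()

    # Tiny stand-in model (the actual PFDet detector is not shown here).
    model = chainer.links.Classifier(chainer.links.Linear(None, 500))
    model.to_gpu()

    # Wrapping the optimizer all-reduces gradients across workers each step.
    optimizer = chainermn.create_multi_node_optimizer(
        chainer.optimizers.MomentumSGD(lr=0.01), comm)
    optimizer.setup(model)

    # Rank 0 loads the dataset; scatter_dataset gives each worker its own shard.
    train, _ = chainer.datasets.get_mnist() if comm.rank == 0 else (None, None)
    train = chainermn.scatter_dataset(train, comm, shuffle=True)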

 

We have published a paper describing our solution, “PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track,” at https://arxiv.org/abs/1809.00778.

We also plan to present the content of the paper at a workshop at the European Conference on Computer Vision (ECCV) 2018.

 

Some of the techniques developed for this competition will be released as additional functionality in ChainerMN and ChainerCV.

 

PFN will continue to work on research and development of image analysis and object detection technologies, and promote their practical applications in our three primary business domains, namely, transportation, manufacturing, and bio/healthcare.

 

*1: A platform for machine learning competitions

*2: A very large training dataset comprising 1.7 million images that contain 12 million annotated objects across 500 classes

Preferred Networks to Launch “MN-1b” Private Sector Supercomputer Adopting NVIDIA Tesla V100 32GB GPUs
Will expand NTT Com Group’s multi-node GPU platform
https://www.preferred.jp/en/news/pr20180328/

TOKYO, JAPAN — Preferred Networks, Inc. (PFN), a provider of IoT-centric deep learning systems, NTT Communications Corporation (NTT Com), the ICT solutions and international communications business within the NTT Group, and NTT Com subsidiary NTTPC Communications Incorporated (NTTPC) announced today that PFN will launch an expanded version of its MN-1 private sector supercomputer, equipped with NTT Com and NTTPC’s next-generation GPU platform, by July. The new MN-1b supercomputer will adopt the NVIDIA(R) Tesla(R) V100 32GB, which was announced at GTC 2018 on March 27, 2018 (U.S. time).

PFN plans to enhance MN-1 by adding 512 NVIDIA Tesla V100 32GB GPUs and have them up and running by July. The added GPUs have a theoretical peak performance of about 56 PetaFLOPS*1, or 56,000 trillion floating-point operations per second, based on the mixed-precision floating-point operations*2 used in deep learning (a rough sketch of this figure follows below). The expansion alone will thus contribute to a roughly threefold increase over the current peak.
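
As a rough sketch of the quoted figure (assuming a per-GPU Tensor Core peak of roughly 112 TFLOPS, a number taken from NVIDIA’s public V100 specifications rather than from this announcement):

    # Approximate mixed-precision peak of the 512 added GPUs.
    # The per-GPU Tensor Core peak is an assumption (NVIDIA V100 public spec).
    num_gpus = 512
    per_gpu_tflops = 112
    print(f"~{num_gpus * per_gpu_tflops / 1e3:.0f} PetaFLOPS")  # ~57, i.e. about 56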

PFN expects that the new supercomputer’s high-speed, massive processing environment, which leverages the latest GPUs, will accelerate the real-world application of its research and development in deep learning and related technologies, thereby strengthening PFN’s global competitiveness. NTT Com and NTTPC will build and operate the multi-node platform that meets PFN’s requirements, drawing on their expertise in communication between GPUs and heat management.

“We are truly honored that Preferred Networks has chosen the NVIDIA Tesla V100 32GB, our most advanced data center GPU with 2X the memory, for the computation environment of its next-generation private supercomputer, MN-1b. With NTT Com Group’s experience in establishing and managing highly reliable data center services, combined with NVIDIA’s latest high-speed GPUs for deep learning, we sincerely look forward to R&D results in the fields of transportation systems, manufacturing and biotech/healthcare,” said Masataka Osaki, Vice President of Corporate Sales and NVIDIA Japan Country Manager.

Emmy Chang, Board Director, Supermicro KK and VP of Strategic Sales, Supermicro, said:
“Preferred Networks is the first in the world to deploy our SuperServer(R) 4029GP-TRT2 equipped with the latest Intel(R) Xeon(R) Scalable processors and supporting eight NVIDIA Tesla V100 32GB GPU accelerators. Preferred Networks has developed this world-class private supercomputer in cooperation with the NTT Com Group, and Supermicro continues to support them with our latest innovative hardware and solutions. We are confident that Preferred Networks will achieve new heights with its new private supercomputer.”

 

PFN will use the new MN-1b to raise the speed of its open-source deep learning framework Chainer(TM) and further accelerate its research and development in fields that require a huge amount of computing resources, namely transportation systems, manufacturing, bio/healthcare, and creativity.

Going forward, NTT Com expects to increasingly support the delivery of AI technologies and related platforms for advanced research and commercialized deep learning, including the AI business initiatives of PFN.

 

Related links:

Chainer:

Enterprise Cloud:

Nexcenter:

 

Notes:

*1: A unit measuring computer performance. Peta is 1,000 trillion (10 to the power of 15) and FLOPS counts floating-point operations per second. So, 1 PetaFLOPS means that a computer is capable of performing 1,000 trillion floating-point calculations per second.

*2: Mixed-precision floating-point operation is a method of floating-point arithmetic that combines multiple precision formats.

Chainer(TM) is a trademark or a registered trademark of Preferred Networks, Inc. in Japan and other countries. Other company names and product names in this release are trademarks or registered trademarks of their respective companies.

 

About Preferred Networks, Inc.

Founded in March 2014 with the aim of promoting business utilization of deep learning technology focused on IoT, PFN advocates Edge Heavy Computing as a way to handle the enormous amounts of data generated by devices in a distributed and collaborative manner at the edge of the network, driving innovation in three priority business areas: transportation, manufacturing and bio/healthcare. PFN develops and provides Chainer, an open source deep learning framework. PFN promotes advanced initiatives by collaborating with world leading organizations, such as Toyota Motor Corporation, Fanuc Corporation and the National Cancer Center.

https://www.preferred.jp/

 

About NTT Communications Corporation

NTT Communications provides consultancy, architecture, security and cloud services to optimize the information and communications technology (ICT) environments of enterprises. These offerings are backed by the company’s worldwide infrastructure, including the leading global tier-1 IP network, the Arcstar Universal One™ VPN network reaching over 190 countries/regions, and over 140 secure data centers worldwide. NTT Communications’ solutions leverage the global resources of NTT Group companies including Dimension Data, NTT DOCOMO and NTT DATA.
www.ntt.com | Twitter@NTT Com | Facebook@NTT Com | LinkedIn@NTT Com

 

About NTTPC Communications Incorporated

NTTPC Communications Incorporated (NTTPC), established in 1985, is a subsidiary of NTT Communications and a network service and communication solution provider in the Japanese telecom market. The company has been one of the group’s most strategic technology companies over the years. NTTPC launched the NTT Group’s first ISP service, “InfoSphere,” in 1995, and Japan’s first Internet data center and server hosting service, “WebARENA,” in 1997. NTTPC has consistently pioneered new offerings in the ICT market.

Related press releases:

Preferred Networks’ private supercomputer ranked first in the Japanese industrial supercomputers TOP 500 list

Preferred Networks Launches one of Japan’s Most Powerful Private Sector Supercomputers
