Caffe2 + PyTorch = PyTorch 1.0


Announcing PyTorch 1.0 for both research and production

The path for taking AI development from research to production has historically involved multiple steps and tools, making it time-intensive and complicated to test new approaches, deploy them, and iterate to improve accuracy and performance. To help accelerate and optimize this process, we’re introducing PyTorch 1.0, the next version of our open source AI framework.

PyTorch 1.0 takes the modular, production-oriented capabilities from Caffe2 and ONNX and combines them with PyTorch’s existing flexible, research-focused design to provide a fast, seamless path from research prototyping to production deployment for a broad range of AI projects. With PyTorch 1.0, AI developers can both experiment rapidly and optimize performance through a hybrid front end that seamlessly transitions between imperative and declarative execution modes. The technology in PyTorch 1.0 has already powered many Facebook products and services at scale, including performing 6 billion text translations per day.

PyTorch 1.0 will be available in beta within the next few months, and will include a family of tools, libraries, pre-trained models, and datasets for each stage of development, enabling the community to quickly create and deploy new AI innovations at scale.

The path from research to production

PyTorch’s imperative front end allows for more rapid prototyping and experimentation through its flexible and productive programming model. The first version of PyTorch launched a little over a year ago, and its speed, productivity, and support for cutting-edge techniques such as dynamic computation graphs quickly made it a popular and important development tool for AI researchers. For example, UC Berkeley computer scientists put PyTorch’s dynamic graph capabilities to use for their noteworthy CycleGAN image-to-image transform work. PyTorch has more than 1.1 million downloads and was the second-most cited deep learning framework on arXiv over the last month.

Although the current version of PyTorch has provided great flexibility for AI research and development, performance at production scale is sometimes a challenge, given its tight coupling to Python. We often need to translate the research code (either the training script or the trained model) to a graph-mode representation in Caffe2 to run at production scale. Caffe2’s graph-based executor allows developers to take advantage of state-of-the-art optimizations like graph transformations, efficient memory reuse, and tight hardware interface integration. The Caffe2 project was launched two years ago to standardize our production AI tooling, and is now running neural networks across Facebook servers and on more than 1 billion phones around the world, spanning eight generations of iPhones and six generations of Android CPU architectures. Today, Caffe2 delivers more than 200 trillion predictions per day across all models, small and large, with optimized production performance.

Migrating a model from PyTorch to Caffe2 to ship it to production used to be manual, time-intensive, and error-prone. To solve this problem, we partnered with major hardware and software companies to create ONNX (Open Neural Network Exchange), an open format for representing deep learning models. With ONNX, developers can share models among different frameworks, for example, by exporting models built in PyTorch and importing them into Caffe2. At Facebook, this has given us smoother AI research, training, and inference with large-scale server and mobile deployment.

We’ve used these tools (PyTorch, Caffe2, and ONNX) to build and deploy Translate, the tool that now runs at scale to power translations for the 48 most commonly used languages on Facebook. In VR, these tools have been critical in deploying new research from Oculus into production to make avatars move more realistically.

However, while this combination of three different tools has been effective, it still involves manual steps that are complicated and time-consuming. As a result, we have not been able to bring new AI research innovations to production as seamlessly as we would like.

Unifying research and production capabilities in one framework

PyTorch 1.0 fuses together immediate and graph execution modes, providing both flexibility for research and performance optimization for production. More specifically, rather than force developers to do an entire code rewrite to optimize or migrate from Python, PyTorch 1.0 provides a hybrid front end enabling you to seamlessly share the majority of code between immediate mode for prototyping and graph execution mode for production.
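To make the distinction between the two modes concrete, here is a minimal pure-Python sketch of the general idea. The same function either runs immediately on plain numbers, or is traced once into a recorded list of operations that can be replayed without re-running the Python code. This is a toy under stated assumptions: `Graph`, `Node`, `trace`, and `run_graph` are hypothetical names invented for illustration, not PyTorch APIs, and the real hybrid front end does far more.

```python
import operator

# Toy illustration only: Graph, Node, trace, and run_graph are hypothetical
# names, not part of PyTorch. They show how eager code can be recorded as a graph.

OPS = {"add": operator.add, "mul": operator.mul}


class Graph:
    """Recorded program: inputs fill slots 0..n-1, each op appends one new slot."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.ops = []        # entries are (op_name, lhs_operand, rhs_operand)
        self.output = None   # slot index holding the final result

    def add_op(self, name, lhs, rhs):
        self.ops.append((name, lhs, rhs))
        return self.num_inputs + len(self.ops) - 1  # slot of this op's result


class Node:
    """Symbolic value that records arithmetic instead of computing it."""

    def __init__(self, graph, slot):
        self.graph = graph
        self.slot = slot

    def _binary(self, name, other, swap=False):
        me = ("slot", self.slot)
        them = ("slot", other.slot) if isinstance(other, Node) else ("const", other)
        lhs, rhs = (them, me) if swap else (me, them)
        return Node(self.graph, self.graph.add_op(name, lhs, rhs))

    def __add__(self, other):
        return self._binary("add", other)

    def __radd__(self, other):
        return self._binary("add", other, swap=True)

    def __mul__(self, other):
        return self._binary("mul", other)

    def __rmul__(self, other):
        return self._binary("mul", other, swap=True)


def trace(fn, num_inputs):
    """Run fn once on symbolic inputs and return the recorded graph."""
    graph = Graph(num_inputs)
    inputs = [Node(graph, i) for i in range(num_inputs)]
    graph.output = fn(*inputs).slot
    return graph


def run_graph(graph, *args):
    """Replay the recorded ops on concrete values, without calling fn again."""
    slots = list(args)

    def value(operand):
        kind, payload = operand
        return slots[payload] if kind == "slot" else payload

    for name, lhs, rhs in graph.ops:
        slots.append(OPS[name](value(lhs), value(rhs)))
    return slots[graph.output]


def model(x, w):
    # Ordinary Python code, shared by both modes.
    return x * w + 1.0


eager_result = model(3.0, 2.0)              # immediate mode: runs line by line
graph = trace(model, num_inputs=2)          # graph mode: record once
graph_result = run_graph(graph, 3.0, 2.0)   # replay the recorded graph
```

Because the recorded graph is plain data, it could in principle be optimized, serialized, or executed outside of Python, which is the property that makes graph execution attractive for production.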

In addition, ONNX is natively woven into PyTorch 1.0 as the model export format, making models from PyTorch 1.0 interoperable with other AI frameworks. ONNX also serves as the integration interface for accelerated runtimes or hardware-specific libraries. This gives developers full freedom to mix and match the best AI frameworks and tools without having to take on resource-intensive custom engineering. Facebook is committed to supporting new features and functionalities for ONNX, which continues to be a powerful open format as well as an important part of developing with PyTorch 1.0.
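The role of an exchange format can be illustrated with a small sketch: one side serializes a model as a framework-neutral description of operators and weights, and another side loads that description and executes it with its own operator implementations. The JSON layout below is invented purely for illustration. It is not the ONNX schema or API, and `export_model`, `import_and_run`, `RUNTIME_OPS`, and the field names are all hypothetical.

```python
import json

# Toy stand-in for a model exchange format. NOT the ONNX schema or API:
# every field name and function here is a hypothetical illustration.


def export_model(weight, bias):
    """'Framework A' describes the model y = x * w + b in a neutral form."""
    model = {
        "format_version": 1,
        "inputs": ["x"],
        "initializers": {"w": weight, "b": bias},  # trained parameters
        "nodes": [
            {"op": "Mul", "inputs": ["x", "w"], "output": "xw"},
            {"op": "Add", "inputs": ["xw", "b"], "output": "y"},
        ],
        "outputs": ["y"],
    }
    return json.dumps(model)


# 'Framework B' supplies its own implementation for each operator name.
RUNTIME_OPS = {
    "Mul": lambda a, b: a * b,
    "Add": lambda a, b: a + b,
}


def import_and_run(serialized, **feeds):
    """Load the neutral description and execute its nodes in order."""
    model = json.loads(serialized)
    values = dict(model["initializers"])
    values.update(feeds)
    for node in model["nodes"]:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = RUNTIME_OPS[node["op"]](*args)
    return [values[name] for name in model["outputs"]]


serialized = export_model(weight=2.0, bias=0.5)
outputs = import_and_run(serialized, x=3.0)
```

The exporting side never needs to know which runtime will execute the model. It only needs both sides to agree on the operator names and their meaning, which is the kind of contract a shared format provides.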

Building an end-to-end deep learning system

Along with PyTorch 1.0, we’ll also open-source many of the AI tools we are using at scale today. These include Translate, a PyTorch Language Library for fast, flexible neural machine translation, as well as the next generation of ELF, a comprehensive game platform for AI reasoning applications. Developers can also take advantage of tools like Glow, a machine learning compiler that accelerates framework performance on different hardware platforms, and Tensor Comprehensions, a tool that automatically generates efficient GPU code from high-level mathematical operations. We have also open-sourced other libraries, such as Detectron, which supports object-detection research, covering both bounding box and object instance segmentation outputs. Visit our AI developer site for the full list, and learn more about PyTorch on the PyTorch and Caffe2 blogs.

Over the coming months, we’re going to refactor and unify the codebases of both the Caffe2 and PyTorch 0.4 frameworks to deduplicate components and share abstractions. The result will be a unified framework that supports efficient graph-mode execution with profiling, mobile deployment, extensive vendor integrations, and more. As with other open AI initiatives like ONNX, we’re also partnering with other companies and the community to give more developers these accelerated research-to-production capabilities. To start, Microsoft plans to support PyTorch 1.0 in its Azure cloud and developer offerings, including Azure Machine Learning services and Data Science Virtual Machines. Amazon Web Services currently supports the latest version of PyTorch, optimized for P3 GPU instances, and plans to make PyTorch 1.0 available shortly after release in its cloud offerings, including its Deep Learning AMI (Amazon Machine Image).

This is just the beginning, as we look to create and share better AI programming models, interfaces and automatic optimizations. AI is a foundational technology at Facebook today, making existing products better and powering entirely new experiences. By opening up our work via papers, code, and models, we can work with all AI researchers and practitioners to advance the state of the art faster and to help apply these techniques in new ways.

PyTorch 1.0 unveiled (with a new logo), and a Go AI open-sourced



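In addition, Facebook open-sourced models for video understanding and natural language processing, open-sourced the Go AI ELF OpenGo, and demonstrated an AI that plays StarCraft.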


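Just days after the official release of v0.4.0, Facebook announced on the second day of its F8 developer conference that it will release PyTorch 1.0, and previewed the new framework's features.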


Yangqing Jia, creator of the deep learning framework Caffe2, summed up this release on Zhihu as Caffe2 + PyTorch = PyTorch 1.0.


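According to Facebook, PyTorch 1.0 combines the modular, production-oriented capabilities of Caffe2 and ONNX with PyTorch's own flexible, research-focused design, providing a fast, seamless path from research prototyping to production deployment for a broad range of AI projects. Users can experiment rapidly and optimize performance through a hybrid front end that switches seamlessly between imperative and declarative execution modes.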

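Beyond combining research and production capabilities, PyTorch 1.0 also incorporates ONNX (Open Neural Network Exchange). ONNX is a neural network model exchange format that Facebook released last year together with a number of hardware and software companies. It has now added support for Apple's Core ML, Baidu's PaddlePaddle, and Qualcomm's SNPE. Together with the previously supported MXNet, Caffe2, PyTorch, TensorFlow, CNTK, and other frameworks, this allows neural network models to be converted among the major mainstream frameworks.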

The PyTorch 1.0 beta will reach users this summer.





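Alongside the PyTorch 1.0 preview, Facebook also open-sourced some of its research. For example, the ResNext3D model for video understanding will be released in June, the video action recognition model Res 2+1 was open-sourced today, and Translate, a natural language understanding library in PyTorch, will also be open-sourced.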



At F8, Facebook also open-sourced a Go AI: ELF OpenGo.


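As with AlphaGo, the point of this AI is not only to play Go but to find better ways of solving problems. ELF OpenGo is now open source and available for download.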



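We also compared it against the well-known LeelaZero. We used LeelaZero's default settings except for ponder (about one minute per move) and its public weights from April 25 (192×15, 158603eb), and our AI won 200 to 0. We are very grateful to the Leela team for their work, and we have sincere respect for their open-source spirit.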



Yuandong Tian (田渊栋), Qucheng Gong (龚渠成) & Jerry Ma (马子嫯), Shubho Sengupta, Zhuoyuan Chen (陈卓远), Larry Zitnick

Location of the ELF OpenGo code and models: