Commit e6063f6c, authored by Yi Wang

Design Doc: New Building System

Parent commit: 0a4b540a
A few months ago, when we were trying to replace CMake with Bazel, @emailweixu suggested that we rewrite those handy Bazel functions using CMake. Now seems to be the right time to get this done, as we are facing problems from the porting of Majel and the development of the new parameter server using Go and C++.
Here are some initial thoughts. Your comments are welcome!
### Required CMake Functions
I think we need only the following few CMake functions to make a project description mean and clean:
| C++ | CUDA C++ | Go |
|---|---|---|
| cc_library | nv_library | go_library |
| cc_binary | nv_binary | go_binary |
| cc_test | nv_test | go_test |
- The `_library` functions generate .a files from source code.
- The `_binary` functions generate executable binary files.
- The `_test` functions generate executable unit test files. They work like `_binary` but also link `-lgtest` and `-lgtest_main`.
The difference between `nv_` functions and `cc_` functions is that the former use `nvcc` instead of the system-default C++ compiler.
Both `nv_` and `cc_` functions enable C++11 (`-std=c++11`).
In addition,
- to describe external dependencies, we need `external_library`.
- to build shared libraries, we need `shared_library`.
### An Example Project
Suppose that we have the aforementioned functions defined in our `/cmake` directory. The following example `CMakeLists.txt` describes a project including the following source files:
- tensor.h
- tensor.cc
- tensor_test.cc
- ops.h
- ops.cu
- ops_test.cu
- api.go
- api_test.go
Suppose that ops.cu depends on CUDNN.
```cmake
# cc_library parses tensor.cc and figures out that the target also
# depends on tensor.h.
cc_library(tensor
  SRCS
  tensor.cc)
# The dependency on target tensor implies that if any of
# tensor{.h,.cc,_test.cc} is changed, tensor_test needs to be rebuilt.
cc_test(tensor_test
  SRCS
  tensor_test.cc
  DEPS
  tensor)
# I don't have a clear idea of what parameters external_library needs
# to have. @gangliao, as a CMake expert, would have better ideas.
external_library(cudnn
  ....)
# Suppose that ops.cu depends on the external target cudnn. Also, ops.cu
# includes global functions that take Tensor as their parameters, so
# ops depends on tensor. This implies that if any of tensor.{h,cc} or
# ops.{h,cu} is changed, ops needs to be rebuilt.
nv_library(ops
  SRCS
  ops.cu
  DEPS
  tensor
  cudnn)  # cudnn is defined above.
nv_test(ops_test
  SRCS
  ops_test.cu
  DEPS
  ops)
# Because api.go defines a Go wrapper to ops and tensor, it depends on
# both. This implies that if any of tensor.{h,cc}, ops.{h,cu}, or
# api.go is changed, api needs to be rebuilt.
go_library(api
  SRCS
  api.go
  DEPS
  tensor  # Because ops depends on tensor, this line is optional.
  ops)
go_test(api_test
  SRCS
  api_test.go
  DEPS
  api)
# This builds libapi.so. shared_library might use the CMake target
# api_shared to distinguish it from the target api above.
shared_library(api
  DEPS
  api)
```
### Implementation
As the example CMakeLists.txt above executes, each function invocation
adds "nodes" to a dependency graph. The functions also use this graph to
generate CMake commands including `add_executable`,
`add_dependencies`, `target_link_libraries`, and `add_test`.
# Design Doc: `go_{library,binary,test}`
## Concerns
1. Need to build Go libraries callable from Go and from C.
For usual Go libraries, the build command line is
```bash
go build -o libfoobar.a foo.go bar.go
```
For Go libraries callable from C/C++, the command line is
```bash
go build -buildmode=c-archive -o libstatic.a foo.go bar.go
```
or for shared libraries:
```bash
go build -buildmode=c-shared -o libdynamic.so foo.go bar.go
```
In both cases, `foo.go`, `bar.go`, etc. must start with the line
`package main`, which places all symbols in the special package `main`.
There must also be a `func main`, which may have an empty body.
1. Need to support building-in-Docker.
We are going to support two ways of building: (1) in Ubuntu, and
(2) in a Docker container whose base image is Ubuntu.
The challenge is (2), because to build in Docker, we run the
development image as:
```bash
git clone https://github.com/PaddlePaddle/Paddle paddle
cd paddle
docker run -v $PWD:/paddle paddle:dev
```
which maps the local repo to `/paddle` in the container.
This requires that all Go code, including third-party packages, be
in the local repo. In fact, it requires `$GOPATH` to be in the local
repo. This affects how we write `import` statements and how we
maintain third-party packages.
## A Solution
This might not be the optimal solution. Comments are welcome.
### Directory structure
We propose the following directory structure:
```
https://github.com/PaddlePaddle/Paddle
                      ↓ git clone
~/work/paddle/go/pkg1/foo.go
                /pkg2/bar.go
                /cmd/cmd1/wee.go
                /cmd/cmd2/qux.go
                /github.com/someone/a_3rd_party_pkg (Git submodule to a 3rd-party pkg)
                /golang.org/another_3rd_party_pkg (Git submodule to another one)
             /build/go ($GOPATH, required by Go toolchain)
                      /src (symlink to ~/work/paddle/go/)
                      /pkg (libraries built by Go toolchain)
                      /bin (binaries built by Go toolchain)
```
The figure above explains how we organize Paddle's Go code:
1. Go source code in Paddle is in `/go` of the repo.
1. Each library package is a sub-directory under `/go`.
1. Each (executable) binary package is a sub-directory under
`/go/cmd`. This is the source tree convention of Go itself.
1. Each 3rd-party Go package is a Git submodule under `/go`.
These rules make sure that all Go source code is in `/go`.
At build time, the Go toolchain requires a directory structure rooted at
`$GOPATH` and having three sub-directories: `$GOPATH/src`,
`$GOPATH/pkg`, and `$GOPATH/bin`, where `$GOPATH/src` is the source
tree and the root of Go's `import` paths.
For example, if `/go/pkg1/foo.go` contains `import
"github.com/someone/a_3rd_party_pkg"`, the Go toolchain will find the
package at `$GOPATH/src/github.com/someone/a_3rd_party_pkg`.
In order to create such a `$GOPATH`, our build system creates
`/build/go`. Remember that we want all output files generated at
build time to be placed in `/build`.
Under `/build/go`, our build system creates a symbolic link `src`
pointing to `/go`, where all Go source code resides.