This document introduces how to use PaddleFL to train a model with the FL strategy Secure Aggregation. With Secure Aggregation, the server aggregates the model parameters without learning their values.
### Dependencies
- paddlepaddle>=1.6
### How to install PaddleFL
Please use a Python environment that has paddlepaddle installed.
```
python setup.py install
```
### Model
The simplest softmax regression model passes the input features through a single fully connected layer and then applies the softmax function to compute and output the probabilities of the classes directly [[PaddlePaddle tutorial: recognize digits](https://github.com/PaddlePaddle/book/tree/develop/02.recognize_digits#references)].
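Such a network takes only a few lines in the Fluid API. The sketch below is illustrative rather than the exact network defined in this example's fl_master.py; the layer names and sizes are assumptions.
```
import paddle.fluid as fluid

# 28x28 grayscale MNIST image and its integer class label
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')

# one fully connected layer with a softmax activation outputs the class probabilities
predict = fluid.layers.fc(input=img, size=10, act='softmax')

# cross-entropy loss averaged over the mini-batch, plus an accuracy metric
cost = fluid.layers.cross_entropy(input=predict, label=label)
avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=predict, label=label)
```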
### Datasets
Public Dataset [MNIST](http://yann.lecun.com/exdb/mnist/)
The dataset will be downloaded automatically by the reader API and cached under `/home/username/.cache/paddle/dataset/mnist`:

| File | Description |
| --- | --- |
| train-images-idx3-ubyte | training set images, 60,000 samples |
| train-labels-idx1-ubyte | training set labels, 60,000 samples |
| t10k-images-idx3-ubyte | test set images, 10,000 samples |
| t10k-labels-idx1-ubyte | test set labels, 10,000 samples |
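The cached files are consumed through the built-in reader. The snippet below is a minimal sketch; the buffer and batch sizes are illustrative and not taken from this example's fl_trainer.py.
```
import paddle

# paddle.dataset.mnist downloads and caches the four files listed above,
# then yields (image, label) samples; paddle.batch groups them into mini-batches
train_reader = paddle.batch(
    paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
    batch_size=64)
test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=64)
```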
### How to work in PaddleFL
PaddleFL has two phases, CompileTime and RunTime. In CompileTime, a federated learning task is defined by fl_master. In RunTime, a federated learning job is executed on fl_server and fl_trainer in distributed clusters. The whole example can be launched with:
```
sh run.sh
```
#### How to work in CompileTime
In this example, we implement the compile-time program in fl_master.py and run it with:
```
python fl_master.py
```
In fl_master.py, we first define the FL-Strategy, User-Defined-Program and Distributed-Config. Then the FL-Job-Generator generates FL-Jobs for the federated server and workers.
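A minimal sketch of that flow is shown below. The module paths, the strategy attribute (`sec_agg`), and the endpoint/worker settings follow the general PaddleFL 1.x example layout and are assumptions; the fl_master.py shipped in this directory is the authoritative version.
```
import paddle.fluid as fluid
from paddle_fl.core.master.job_generator import JobGenerator
from paddle_fl.core.strategy.fl_strategy_base import FLStrategyFactory

# User-Defined-Program: the softmax regression network sketched above
img = fluid.layers.data(name='img', shape=[1, 28, 28], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
predict = fluid.layers.fc(input=img, size=10, act='softmax')
loss = fluid.layers.mean(fluid.layers.cross_entropy(input=predict, label=label))

job_generator = JobGenerator()
job_generator.set_optimizer(fluid.optimizer.SGD(learning_rate=0.1))
job_generator.set_losses([loss])
job_generator.set_startup_program(fluid.default_startup_program())
job_generator.set_infer_feed_and_target_names([img.name, label.name], [predict.name])

# FL-Strategy: enable secure aggregation
build_strategy = FLStrategyFactory()
build_strategy.sec_agg = True
strategy = build_strategy.create_fl_strategy()

# Distributed-Config + FL-Job-Generator: write the server and trainer jobs into fl_job_config
endpoints = ["127.0.0.1:8181"]
job_generator.generate_fl_job(
    strategy, server_endpoints=endpoints, worker_num=2, output="fl_job_config")
```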
In fl_server.py, we load and run the FL server job.
```
# import paths follow the PaddleFL 1.x examples; check fl_server.py for the exact ones
from paddle_fl.core.server.fl_server import FLServer
from paddle_fl.core.master.fl_job import FLRunTimeJob

server = FLServer()
server_id = 0
job_path = "fl_job_config"                  # directory generated by fl_master.py
job = FLRunTimeJob()
job.load_server_job(job_path, server_id)    # load the server-side program
server.set_server_job(job)
server.start()                              # start serving aggregation requests
```
In fl_trainer.py, we prepare the MNIST dataset, load and run the FL trainer job, and then evaluate the accuracy. Before training, each party prepares its own private key and the other parties' public keys. Then, each party derives a random noise with every other party by running the Diffie-Hellman key agreement protocol on its private key and that party's public key [1]. If the other party's id is larger than this party's id, the shared noise is added to the model parameters; if the other party's id is smaller, the shared noise is subtracted. The model parameters are therefore masked before being uploaded to the server, and the random noises cancel out when the server aggregates the masked parameters from all parties.
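To see why the masks cancel, consider the toy computation below. It is not the PaddleFL implementation, only a numeric illustration: the shared noises are made-up constants standing in for the values derived from the Diffie-Hellman secrets.
```
# Toy illustration of pairwise masking (not the PaddleFL code).
# noise[(i, j)] stands for the noise parties i and j derive from their
# shared Diffie-Hellman secret, so both parties see the same value.
noise = {(0, 1): 7.0, (0, 2): -3.5, (1, 2): 2.25}

def shared_noise(i, j):
    # the pair's shared noise is the same no matter which party asks for it
    return noise[(min(i, j), max(i, j))]

params = {0: 1.0, 1: 2.0, 2: 3.0}   # one scalar "parameter" per party
masked = {}
for i in params:
    mask = 0.0
    for j in params:
        if j > i:
            mask += shared_noise(i, j)   # other party has a larger id: add the shared noise
        elif j < i:
            mask -= shared_noise(i, j)   # other party has a smaller id: subtract it
    masked[i] = params[i] + mask

# the server only sees the masked values, yet their sum equals the true sum,
# because every shared noise is added exactly once and subtracted exactly once
assert abs(sum(masked.values()) - sum(params.values())) < 1e-9
```
Back in fl_trainer.py, the training loop periodically evaluates the model, stops after 40 epochs, and saves the inference program every 100 steps: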
print("Test with Epoch %d, avg_cost: %s, acc: %s"%(epoch_id,avg_loss_val,acc_val))
ifepoch_id>40:
break
ifstep_i%100==0:
trainer.save_inference_program(output_folder)
```
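For completeness, loading and starting the trainer job mirrors the server side. The sketch below follows the usual PaddleFL 1.x trainer layout; the module paths and the `trainer_num`/`key_dir` attributes are assumptions to verify against the fl_trainer.py shipped here.
```
import sys
from paddle_fl.core.master.fl_job import FLRunTimeJob
from paddle_fl.core.trainer.fl_trainer import FLTrainerFactory

trainer_id = int(sys.argv[1])      # each party is launched with its own id
job = FLRunTimeJob()
job.load_trainer_job("fl_job_config", trainer_id)

trainer = FLTrainerFactory().create_fl_trainer(job)
trainer.trainer_id = trainer_id    # the id decides whether a shared noise is added or subtracted
trainer.trainer_num = 2            # total number of parties (assumed value)
trainer.key_dir = "./keys/"        # directory holding the Diffie-Hellman key files (assumed path)
trainer.start()
```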
[1] Aaron Segal, Antonio Marcedone, Benjamin Kreuter, Daniel Ramage, H. Brendan McMahan, Karn Seth, Keith Bonawitz, Sarvar Patel, Vladimir Ivanov. **Practical Secure Aggregation for Privacy-Preserving Machine Learning**, The 24th ACM Conference on Computer and Communications Security (**CCS**), 2017
"""
decorators declares some decorators that ensure the object has the
correct keys declared when need be.
"""
defrequires_private_key(func):
"""
Decorator for functions that require the private key to be defined.
"""
deffunc_wrapper(self,*args,**kwargs):
ifhasattr(self,"private_key"):
func(self,*args,**kwargs)
else:
self.generate_private_key()
func(self,*args,**kwargs)
returnfunc_wrapper
defrequires_public_key(func):
"""
Decorator for functions that require the public key to be defined. By definition, this includes the private key, as such, it's enough to use this to effect definition of both public and private key.
exceptions is responsible for exception handling etc.
"""
classMalformedPublicKey(BaseException):
"""
The public key is malformed as it does not meet the Legendre symbol criterion. The key might have been tampered with or might have been damaged in transit.
"""
def__str__(self):
return"Public key malformed: fails Legendre symbol verification."
classRNGError(BaseException):
"""
Thrown when RNG could not be obtained.
"""
def__str__(self):
return"RNG could not be obtained. This module currently only works with Python 3."