About the role
Motivations and the subject
The goal of the thesis is to carry out research on decentralized and efficient federated AutoML for heterogeneous embedded devices.
The training of AI models for service delivery is undergoing a conceptual shift: model learning is moving close to the data, onto users' devices.
These devices have limited resources and must remain fully operational during the learning phase. In addition, users today generate sensitive data, so new collaborative learning algorithms need to be developed and optimized for different embedded devices, ranging from smartphones to IoT devices.
Today, building an AI model usually requires collecting data on a central server (cloud). This approach raises problems of privacy, control over data usage and computational resources.
Federated learning (FL) [1, 2] is a recent AI approach based on collaborative training that addresses these problems. Each model is trained on a user's local data, and only its parameters are exchanged with other users to build a global model.
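To make the parameter exchange concrete, here is a minimal sketch of one common aggregation scheme, weighted parameter averaging (often called FedAvg). It is an illustration only, not the method the thesis will develop: the linear model, the toy data and the names `local_update`, `federated_average` and `run_rounds` are all assumptions made for this example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: List[float], data: List[Sample],
                 lr: float = 0.5, epochs: int = 5) -> List[float]:
    """Fit y ~ w*x + b by gradient descent using only this client's data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Average parameters, weighting each client by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[List[Sample]], rounds: int = 100) -> List[float]:
    """Repeat: send the global model out, train locally, average the results."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical toy data: both clients observe y = 2x + 1 on different samples.
clients = [
    [(0.0, 1.0), (0.5, 2.0)],
    [(1.0, 3.0), (0.25, 1.5), (0.75, 2.5)],
]

if __name__ == "__main__":
    print(run_rounds(clients))
```

Only the two model parameters ever leave a client in this sketch; the `(x, y)` samples stay local. A real deployment would add the concerns listed in this offer, such as device availability, heterogeneous data and communication cost.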
The challenges of federated learning are (a) obtaining efficient and robust decentralized FL models with heterogeneous data, (b) optimizing resources for actual operational deployment, and (c) customizing services and optimizing models based on available resources for groups of users, because a single global model may be less explainable, accurate and appropriate than a personalized model.
We will deploy deep neural networks on users' devices because they achieve high classification and prediction accuracy in various tasks.
However, training them requires significant effort to find optimal hyperparameters, which limits their use on resource-constrained devices.
Emerging research areas address automatic neural network generation [3] and automatic search for appropriate architectures (Neural Architecture Search, NAS), capabilities required for real-world deployments.
FL NAS [4] aims to optimize the architecture of neural network models in the FL environment. Many questions in this domain remain open.
For example, no approaches have yet been developed for FL in which clients share the same sample space but have different feature spaces.
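The setting in which clients share samples but hold different features is commonly called vertical federated learning, as opposed to the horizontal setting in which clients hold the same features for different samples. The sketch below only illustrates the two data layouts; the records, feature names and the helper `partition_type` are hypothetical and chosen for this example.

```python
from typing import Dict, Set

Records = Dict[str, Dict[str, float]]  # sample id -> {feature name: value}


def feature_names(records: Records) -> Set[str]:
    """Collect every feature name that appears in a client's records."""
    return {name for row in records.values() for name in row}


def partition_type(a: Records, b: Records) -> str:
    """Classify how two clients' datasets relate to each other."""
    same_samples = set(a) == set(b)
    same_features = feature_names(a) == feature_names(b)
    if same_features and not same_samples:
        return "horizontal"  # same feature space, different samples
    if same_samples and not same_features:
        return "vertical"    # same sample space, different features
    return "other"


# Horizontal layout: two clients describe different users with the same features.
horizontal_a: Records = {"u1": {"age": 34.0, "usage": 1.2}}
horizontal_b: Records = {"u2": {"age": 51.0, "usage": 0.4}}

# Vertical layout: two clients describe the same users with different features.
vertical_a: Records = {"u1": {"age": 34.0}, "u2": {"age": 51.0}}
vertical_b: Records = {"u1": {"usage": 1.2}, "u2": {"usage": 0.4}}

if __name__ == "__main__":
    print(partition_type(horizontal_a, horizontal_b))
    print(partition_type(vertical_a, vertical_b))
```

In the vertical layout no single client can compute a full forward pass alone, which is one reason why parameter averaging as used in horizontal FL does not carry over directly.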
Scientific objectives and challenges
The objective of the thesis is to design a federated learning framework that automatically generates low-power neural networks in compliance with GDPR [5], for (a) homogeneous and (b) heterogeneous devices under device constraints (availability, resources, states), and to study it in a fully decentralized peer-to-peer federated learning setup.
You have a Master's degree in Data Science or Computer Science, and you are a curious person who likes to learn and to seek solutions.
You are highly motivated to do your thesis in the emerging field of distributed algorithms for embedded devices. You have skills in machine learning, optimization and statistics (essential), as well as good programming skills and knowledge of embedded devices (desirable).
Interest in the field of signal processing is a plus.
Furthermore, autonomy and open-mindedness are particularly valued qualities for research work. Dynamism, the ability to put forward proposals and good communication skills are also required for this position.
English will be used throughout the thesis (reading the state of the art, writing articles and presenting results at international conferences), so an excellent level of English is required.
[1] J. Konecny, H. B. McMahan, F. X. Yu, P. Richtarik, A. T. Suresh and D. Bacon, Federated Learning: Strategies for Improving Communication Efficiency, arXiv, 2017, pp. 1-10.
[2] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, Ch. M. Kiddon, J. Konečný, S. Mazzocchi, B. McMahan, T. Van Overveldt, D. Petrou, D. Ramage and J. Roselander, Towards Federated Learning at Scale: System Design, SysML 2019, https://arxiv.org/abs/1902.01046, 2019.
[3] A. Wong, M. J. Shafiee, B. Chwyl and F. Li, FermiNets: Learning generative machines to generate efficient neural networks via generative synthesis, 1809.05989.pdf (arxiv.org), NIPS, 2018.
[4] H. Zhu, H. Zhang and Y. Jin, From Federated Learning to Federated Neural Architecture Search, https://arxiv.org/pdf/2009.05868.pdf, 2020.
[5] Regulation (EU) 2016/679 of the European Parliament and of the Council (article 30), https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=EN#d1e3265-1-1, archived from the original on 28 June 2017.
What are the additional opportunities of this thesis?
The objective of this thesis is to design new decentralized (peer-to-peer) federated learning methods based on client clusters, with models that adapt themselves (AutoML) to the available resources.
The goal is to develop federated learning algorithms that train low-power neural networks on heterogeneous equipment under strong capacity constraints.
Moreover, this thesis will give you insight into industrial research. The multidisciplinary team you will join offers a stimulating and enriching environment in which to learn and carry out your thesis.
Furthermore, you will join a research ecosystem that proposes and deploys concrete implementations of the concepts studied.
In your thesis you will develop algorithms for services offered on customer premises equipment (for example, smartphones or Boxes) and help shape new AI-based telecommunication services in the long term.
Participation in collaborative projects with industrial and academic partners may be possible.
Orange Innovation anticipates technological breakthroughs and supports the Group's countries and entities in making the best technological choices to meet the needs of our consumer and business customers.
With a global vision and a wide range of profiles (researchers, engineers, designers, developers, data scientists, sociologists, graphic designers, marketers, cybersecurity experts, etc.), the men and women of the Innovation Department are devoted to providing personalized solutions to local and international customers, as well as to Business Units, to make Orange a trusted multi-service operator.
We train experts in the field and ensure continuous improvement in the performance and efficiency of our services.
With 720 researchers, thousands of marketers, developers, designers and data analysts, it is the expertise of our 6,000 employees that fuels this ambition every day.
Within the Orange Innovation Department, you will join a multidisciplinary team of 24 people (researchers, data scientists, data engineers, developers, .