Negotiation, or bargaining, is a well-known procedure in multi-agent systems. In economics, the bargaining problem arises when a trade yields some gain. The challenge is to divide the gain given that (i) there is a conflict of interest, and (ii) any agreement must be approved by all involved individuals. Each agent follows some strategy in negotiation. A negotiation strategy defines the sequence of actions (usually offers, counteroffers, accept, or reject) that an agent takes during the entire interaction. Moreover, the agents’ actions should be compatible with a particular negotiation protocol. Often, some information, e.g., about preferences, is available to the negotiating agents. However, if such information is not available, a negotiation procedure can be performed sequentially so that an intelligent agent learns to negotiate optimally over time. In this thesis, the goal is to study, analyze, and potentially develop negotiation strategies under uncertainty about different factors that affect the negotiation, such as agents’ preferences.
Crowdsourcing is a valuable tool for numerous applications, for example, to acquire reliable ratings for services or goods online or to create labeled datasets for machine learning using online platforms. Efficient crowdsourcing entails facing several challenges. One crucial challenge is to formally model how individuals decide whether to participate in micro-task crowdsourcing, where monetary compensation or any other type of utility serves as an incentive. The challenge is aggravated for the crowdsourcer when its ability or resources for compensation are limited and, at the same time, there are several tasks to be crowdsourced, potentially with different priorities. In this thesis, one goal is to categorize and study the incentive methods for crowdsourcing. Another goal is to develop incentive mechanisms for efficient crowdsourcing of different tasks with scarce compensation resources.
The ever-increasing demand for media streaming, together with limited backhaul capacity, renders developing efficient file-delivery methods imperative. One such method is caching, which is enabled by the asynchronous content reuse property of multimedia content. To realize the potential of caching, the most popular contents are saved at suitable locations in the network and are delivered upon demand. The problem becomes challenging when there is a lack of information about the popularity of the contents, the network, and other impactful factors. In this thesis, the goal is to optimize content caching in peer-to-peer networks under a lack of information, using methods from machine learning and artificial intelligence.
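As a rough illustration of caching when content popularity is not known in advance, the following sketch estimates popularity from observed request counts and always keeps the most requested items. It is only a simple frequency-based baseline under stated assumptions: a single cache, unit-size contents, and free cache replacement. The function name `simulate_popularity_cache` and the request trace are hypothetical and not part of the thesis description.

```python
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple


def simulate_popularity_cache(
    requests: Iterable[Hashable], capacity: int
) -> Tuple[int, Set[Hashable]]:
    """Serve requests from a cache that holds the `capacity` most requested items.

    Popularity is unknown beforehand and is estimated by counting requests.
    Returns the number of cache hits and the final cache content.
    """
    counts: Counter = Counter()   # empirical popularity estimate
    cache: Set[Hashable] = set()
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
        counts[item] += 1
        # Re-select the items with the highest request counts observed so far.
        cache = {name for name, _ in counts.most_common(capacity)}
    return hits, cache


# Hypothetical request trace in which file "a" is clearly the most popular.
trace = ["a", "a", "b", "a", "c", "a", "b"]
hits, final_cache = simulate_popularity_cache(trace, capacity=1)
print(hits, final_cache)
```

A learning-based method would replace the plain request counter with a popularity estimator that also balances exploration and exploitation, which is where bandit-style approaches become relevant.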
In federated learning, several participants (clients) contribute to model development. The participants receive a model and update its parameters using their local data. Then they send their parameter updates to a central unit. The central unit combines all the updates, for example, by averaging, and develops a new shared model. The iterations continue until the required model accuracy is reached. To maximize the accuracy of the learned model, federated learning would attempt to benefit from every reliable participant; nevertheless, maximizing the number of participants is often inefficient, for example, due to communication constraints or for financial reasons when the participants receive reimbursement. Therefore, a more efficient solution is to select the best set of participants that satisfies the required constraints.
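The aggregation step described above can be sketched as a weighted average of the clients' parameter vectors. This is a minimal sketch, assuming each client reports a plain list of floats and that the weights are the clients' local dataset sizes; both the function name `federated_average` and the weighting rule are assumptions made for illustration, and participant selection is not modeled here.

```python
from typing import List, Sequence


def federated_average(
    updates: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine client parameter vectors into one shared model by weighted averaging."""
    if not updates or len(updates) != len(weights):
        raise ValueError("need exactly one weight per client update")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(updates[0])
    averaged = [0.0] * dim
    for params, weight in zip(updates, weights):
        if len(params) != dim:
            raise ValueError("all updates must have the same length")
        for i, value in enumerate(params):
            averaged[i] += (weight / total) * value
    return averaged


# Three hypothetical clients with two-parameter models and different data sizes.
client_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 10, 20]
global_params = federated_average(client_params, client_sizes)
print(global_params)
```

Selecting a subset of participants then amounts to choosing which rows of `client_params` enter the average in each round, subject to the communication or budget constraints.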
The multi-armed bandit (MAB) problem is a sequential decision-making framework in which an agent explores a set of arms with unknown reward distributions. Upon pulling an arm, it receives a reward drawn from that arm’s distribution. In the best-arm identification setting, the agent aims to find the optimal arm with high probability, either within a fixed number of rounds or with a predefined confidence level. The problem becomes challenging in some circumstances. For example, sometimes exploring the entire arm set to recommend the best one is infeasible. In the multi-agent setting, the agents can overcome such challenges through collaboration, which is referred to as federated best-arm identification. This thesis includes developing such algorithms for various scenarios and studying their applications in distributed sensing in communication networks.
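To make the fixed-budget variant of best-arm identification concrete, the sketch below uses Sequential Halving, one well-known single-agent algorithm for this setting. It is not the federated method targeted by the thesis; the Bernoulli arms, their means, and the budget are hypothetical values chosen only for illustration.

```python
import math
import random
from typing import Callable


def sequential_halving(pull: Callable[[int], float], n_arms: int, budget: int) -> int:
    """Recommend one arm after spending roughly `budget` pulls.

    In each round the surviving arms are pulled equally often, and the
    half with the lowest empirical means is discarded.
    """
    survivors = list(range(n_arms))
    n_rounds = max(1, math.ceil(math.log2(n_arms)))
    for _ in range(n_rounds):
        if len(survivors) == 1:
            break
        pulls_per_arm = max(1, budget // (len(survivors) * n_rounds))
        means = {}
        for arm in survivors:
            rewards = [pull(arm) for _ in range(pulls_per_arm)]
            means[arm] = sum(rewards) / pulls_per_arm
        # Keep the better half (rounded up) of the surviving arms.
        survivors.sort(key=lambda arm: means[arm], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


# Hypothetical Bernoulli arms; the last arm has the highest mean reward.
rng = random.Random(0)
arm_means = [0.1, 0.2, 0.9]


def pull_arm(arm: int) -> float:
    return 1.0 if rng.random() < arm_means[arm] else 0.0


recommended = sequential_halving(pull_arm, n_arms=len(arm_means), budget=3000)
print(recommended)
```

In a federated variant, several agents would each explore only part of the arm set and exchange summary statistics, so that no single agent needs to pull every arm.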
The multi-armed bandit (MAB) problem is a sequential decision-making problem in which an agent explores a set of arms with unknown reward distributions to find the optimal one. Upon pulling an arm, it receives a reward drawn from that arm’s distribution. Using the history of actions and rewards, the agent improves future choices to collect more rewards. Sometimes, the reward of the played action is revealed only after some delay, which makes optimal decision-making even more challenging. Additionally, the delayed feedback for different arms might be correlated or have another type of interdependency. The thesis investigates the applicability of signal-processing methods to boost online decision-making performance under delayed feedback.
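The effect of delayed feedback can be illustrated with a simple baseline: a UCB1-style agent that updates its estimates only with rewards that have already been revealed. This is a minimal sketch, assuming Bernoulli rewards and a fixed, known delay shared by all arms; it does not include the signal-processing methods or the correlated delays that the thesis is meant to investigate, and the function name `ucb_with_delay` is hypothetical.

```python
import math
import random
from collections import deque
from typing import List


def ucb_with_delay(means: List[float], horizon: int, delay: int, seed: int = 0) -> List[int]:
    """Run a UCB1-style agent whose rewards are revealed `delay` rounds late.

    Returns how often each arm was played.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    pulls = [0] * n_arms        # plays per arm
    observed = [0] * n_arms     # rewards already revealed per arm
    reward_sum = [0.0] * n_arms
    pending = deque()           # (arrival_round, arm, reward), ordered by arrival

    for t in range(horizon):
        # Reveal every reward whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            observed[arm] += 1
            reward_sum[arm] += reward

        unobserved = [a for a in range(n_arms) if observed[a] == 0]
        if unobserved:
            # No feedback yet for some arms: play the least-played one among them.
            arm = min(unobserved, key=lambda a: pulls[a])
        else:
            total = sum(observed)
            arm = max(
                range(n_arms),
                key=lambda a: reward_sum[a] / observed[a]
                + math.sqrt(2.0 * math.log(total) / observed[a]),
            )

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        pending.append((t + delay, arm, reward))
    return pulls


# Two hypothetical arms; feedback arrives five rounds after each play.
play_counts = ucb_with_delay([0.2, 0.8], horizon=2000, delay=5)
print(play_counts)
```

Because the index is computed from stale statistics, the agent keeps exploring blindly during the first `delay` rounds; larger delays lengthen this phase.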
Situational awareness, or situation awareness, consists of three elements: (i) perception of environmental elements and events with respect to time or space; (ii) comprehension of the meaning and relation of the perceived events; (iii) projection of the future status using the obtained knowledge. Although the concept dates back to the nineties, recent research works leverage machine learning and artificial intelligence to enhance and enrich the concept and its associated methods. Besides, situational awareness plays a crucial role in different scenarios of the Internet of Things (IoT), which is a highly dense network consisting of humans, machines, and processes. In this thesis, the goal is to study AI-enabled situational awareness and to investigate its application in IoT-related scenarios.
In the seminal setting of one-shot hide-and-seek games, a hider selects one out of k locations to hide. A seeker then chooses n of the k locations to search for the hider. The seeker’s payoff is the probability that she finds the hider, whereas the hider’s payoff is the probability that she successfully escapes the seeker’s pursuit. So far, researchers have formulated and studied several variants of such games. Hide-and-seek games find several applications in modeling and solving problems that arise in networked intelligent systems. In this thesis, the goal is to study such games systematically, classify the associated problems and the corresponding state-of-the-art solutions, and discuss the applications.
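For the basic one-shot model above, the seeker's payoff against a mixed hiding strategy is simply the total probability mass of the searched locations. The sketch below computes this payoff and the seeker's best response by enumeration. It assumes the hider's strategy is given as a probability vector and that the seeker knows it; the function names and the example distributions are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from typing import Sequence, Tuple


def seeker_payoff(hider_probs: Sequence[Fraction], searched: Sequence[int]) -> Fraction:
    """Probability that the hider is in one of the searched locations."""
    return sum((hider_probs[i] for i in searched), Fraction(0))


def best_response(hider_probs: Sequence[Fraction], n: int) -> Tuple[Tuple[int, ...], Fraction]:
    """Enumerate all n-subsets of locations and return one that maximizes the payoff."""
    k = len(hider_probs)
    best = max(combinations(range(k), n), key=lambda s: seeker_payoff(hider_probs, s))
    return best, seeker_payoff(hider_probs, best)


# k = 5 locations, n = 2 searches.
uniform = [Fraction(1, 5)] * 5
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 16), Fraction(1, 16)]

print(best_response(uniform, 2))  # every pair of locations yields payoff n/k = 2/5
print(best_response(skewed, 2))   # the two most likely locations are searched
```

In this model, hiding uniformly at random holds the seeker to a payoff of n/k, whereas any non-uniform strategy lets an informed seeker do at least as well by searching the n most likely locations.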
In reinforcement learning, a fundamental challenge is the curse of dimensionality. In some cases, using the structure of the state and action spaces can improve the agent's decision-making performance by increasing the exploration efficiency. This thesis aims to comprehensively study various approaches to modeling a structured state space within the reinforcement learning framework. Besides, it provides in-depth insights into exploiting the structure to estimate the state values. The coding part includes implementing a few related methods and comparing their performance using real-world datasets.
The majority of state-of-the-art research studies an agent that acts with bounded rationality according to a fixed decision-making strategy towards a specific goal, for example, regret minimization in multi-armed bandits or maximizing utility when forming coalitions. However, as decision-makers, humans do not constantly follow a specific strategy and do not always act rationally. Their decisions are also shaped by characteristics such as altruism, envy, curiosity, and the like. This thesis studies how such human characteristics influence decision-making in specific situations, e.g., when gambling or forming coalitions.