Half Field Offense (HFO) is a multi-agent task that is a part of RoboCup soccer. As the video below shows, it is a simplified version of soccer played on one half of the soccer field between an offense team and a defense team. Each coloured circular disk is a soccer player, controlled by a separate client program. The clients connect to and communicate with a server, periodically receiving snapshots of their state and specifying actions. The server implements the physical simulation and applies the rules of soccer. The HFO simulator allows us to simulate and evaluate different offensive and defensive strategies involving teams of different sizes.
(Video source: https://www.cse.iitb.ac.in/~shivaram/buffer/31876/3v3.mp4.)
In this assignment, we shall focus solely on the behaviour of an offense player when it is in possession of the ball. When not in possession of the ball, offense players play a fixed strategy. The defense players, too, follow a fixed, programmed strategy. When the offense team does not have possession of the ball, at least one player goes straight to the ball in order to get possession. What must it do once it has possession?
The player represents its state through features such as distances and angles between itself and other players, as well as with static objects such as the goal. The actions available to the player are DRIBBLE, SHOOT, and PASS(k), wherein k is the index of an offense teammate (if there are any). Your task is precisely to code the mapping between state and action: such a mapping is called a policy.
Detailed descriptions of the state and action spaces are given in the sections below, along with a sequence of steps to get you up and running. You will submit agents for playing 1 versus 1 (1v1) and 2 versus 2 (2v2) HFO.
Here are some of the relevant command line options for running HFO.
You can kill simulations by pressing CTRL+c in the terminal running the server.
The agent's state is represented using a feature vector, which consists of 10 + 6T + 3O floating point numbers, where T is the number of teammates (one less than the number of offense players), and O is the number of opponents. The features are listed below in sequence.
Index | Feature | Description |
0 | x position | Agent's global x coordinate |
1 | y position | Agent's global y coordinate |
2 | Orientation | The global direction the agent is facing |
3 | Ball x position | Ball's global x coordinate |
4 | Ball y position | Ball's global y coordinate |
5 | Able to Kick | Boolean indicating if the agent can kick the ball |
6 | Goal centre proximity | Distance between agent and goal centre |
7 | Goal centre angle | Angle between x axis and line connecting agent to goal centre |
8 | Goal opening angle | The magnitude of the largest open angle (between opponents) of the agent to the goal |
9 | Proximity to opponents | Proximity to the closest opponent |
10–(10+T-1) | Teammate goal opening angle | For each teammate, its goal opening angle |
(10+T)–(10+2T-1) | Teammate proximity to opponent | For each teammate, its proximity to its closest opponent |
(10+2T)–(10+3T-1) | Teammate pass opening angle | For each teammate, the magnitude of the open angle available to pass the ball to the teammate |
(10+3T)–(10+6T-1) | Teammate global x position, teammate global y position, teammate uniform number | Three consecutive features for each teammate: global x position, global y position, and uniform number |
(10+6T)–(10+6T+3O-1) | Opponent global x position, opponent global y position, opponent uniform number | Three consecutive features for each opponent: global x position, global y position, and uniform number |
All non-boolean features except the uniform numbers are normalized to the range [-1, 1]. Occasionally a feature may be marked as "invalid", in which case it is given the value -2.
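The index arithmetic in the table can be sketched in a few lines of Python. This is only an illustration of the layout described above: the names `feature_count`, `feature_layout`, and `is_valid` are hypothetical helpers, not part of the HFO code, and the validity check assumes that any value well below -1 is the "invalid" marker.

```python
from typing import Dict

NUM_FIXED_FEATURES = 10  # indices 0-9 in the table above


def feature_count(T: int, O: int) -> int:
    """Total length of the state vector for T teammates and O opponents."""
    return NUM_FIXED_FEATURES + 6 * T + 3 * O


def feature_layout(T: int, O: int) -> Dict[str, range]:
    """Index ranges of the per-teammate and per-opponent blocks.

    The key names are illustrative labels, not identifiers used by HFO.
    """
    base = NUM_FIXED_FEATURES
    return {
        "teammate_goal_opening_angle": range(base, base + T),
        "teammate_proximity_to_opponent": range(base + T, base + 2 * T),
        "teammate_pass_opening_angle": range(base + 2 * T, base + 3 * T),
        # Three consecutive values (x, y, uniform number) per teammate.
        "teammate_x_y_unum": range(base + 3 * T, base + 6 * T),
        # Three consecutive values (x, y, uniform number) per opponent.
        "opponent_x_y_unum": range(base + 6 * T, base + 6 * T + 3 * O),
    }


def is_valid(value: float) -> bool:
    """Assumption: normalized features lie in [-1, 1], so a value
    near -2 can only be the 'invalid' marker."""
    return value > -1.5
```

For example, `feature_count(0, 1)` evaluates to 10 + 0 + 3 = 13, and with no teammates every teammate range in `feature_layout(0, 1)` is empty, so the opponent block starts directly at index 10.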
The following are the actions available to the agent. These are all "high-level" actions, which are implemented by the agent using lower-level skills (which we need not bother with for this assignment).
Observe that for 1v1 HFO (T = 0, O = 1), the number of state features is 13 and the number of actions is 2 (PASS is not valid). For 2v2 HFO (T = 1, O = 2), the formula above gives 10 + 6 + 6 = 22 state features, and the number of actions is 3. The policies you program need not use all the features available; take some time to think which ones could matter the most.
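To make the idea of a policy concrete, here is a minimal hand-coded sketch that uses only a few features. It is not the expected solution and does not call the HFO library: actions are returned as plain strings, the thresholds are arbitrary placeholder values, and the `k` in `PASS(k)` is taken to be the teammate's 0-based slot in the feature vector, which is an assumption. It also assumes that a larger normalized opening angle means a wider opening.

```python
from typing import Sequence

GOAL_OPENING_ANGLE = 8   # index from the feature table
FIRST_TEAMMATE_INDEX = 10

# Hypothetical thresholds on the normalized angles; tune by experiment.
SHOOT_THRESHOLD = 0.0
PASS_THRESHOLD = 0.0


def _valid(value: float) -> bool:
    # Assumption: normalized features lie in [-1, 1]; -2 marks "invalid".
    return value > -1.5


def choose_action(state: Sequence[float], num_teammates: int = 0) -> str:
    """Map a state vector to 'SHOOT', 'PASS(k)' or 'DRIBBLE'."""
    own_angle = state[GOAL_OPENING_ANGLE]

    # Shoot when the agent's own view of the goal is wide enough.
    if _valid(own_angle) and own_angle >= SHOOT_THRESHOLD:
        return "SHOOT"

    # Otherwise pass to the teammate with the best goal opening,
    # provided the passing lane is open and the teammate is better placed.
    best_k, best_angle = None, own_angle
    for k in range(num_teammates):
        mate_goal_angle = state[FIRST_TEAMMATE_INDEX + k]
        pass_angle = state[FIRST_TEAMMATE_INDEX + 2 * num_teammates + k]
        if not (_valid(mate_goal_angle) and _valid(pass_angle)):
            continue
        if pass_angle >= PASS_THRESHOLD and mate_goal_angle > best_angle:
            best_k, best_angle = k, mate_goal_angle
    if best_k is not None:
        return f"PASS({best_k})"

    # Default: keep the ball and move on.
    return "DRIBBLE"
```

With `num_teammates=0` the loop never runs, so the sketch reduces to a two-action SHOOT/DRIBBLE rule suitable for 1v1. A stronger policy would likely also look at the proximity features, whose direction of normalization should be checked empirically before relying on them.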
This task is to familiarise you with the simulation, the state
variables, and the task facing the offense player with the
ball. Specifically, look at the
Use the
In this task, you have to program a 1v1 offense agent to maximise
its scoring percentage. Revise
the
To check performance of your agent, run
In this task, you have to program a 2v2 offense agent to maximise
the scoring percentage of the offense team. Revise
the
To check performance of your agent, run
You are likely to find the
Your agents will be evaluated by calling the corresponding autograder scripts.
The 1v1 agent will be evaluated for 4 marks. It will be given 4 marks if its goal scoring percentage exceeds 75%, otherwise 3 marks if it exceeds 60%, otherwise 2 marks if it exceeds 50%, otherwise 0 marks. Extra credit of 2 marks will be awarded if the agent's goal scoring percentage exceeds 95%.
The 2v2 agent will be evaluated for 6 marks. It will be given 6 marks if its goal scoring percentage exceeds 60%, otherwise 4 marks if it exceeds 50%, otherwise 2 marks if it exceeds 40%, otherwise 0 marks. Extra credit of 2 marks will be awarded if the agent's goal scoring percentage exceeds 75%.
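The marking scheme above can be written out as a small function, which may help when checking what a given scoring percentage is worth. This is only a sketch of the rules as stated here, not the autograder itself; the function and constant names are hypothetical, and "exceeds" is read as a strict inequality.

```python
from typing import List, Tuple

# (threshold in percent, marks), checked from the highest threshold down.
TIERS_1V1: List[Tuple[float, int]] = [(75, 4), (60, 3), (50, 2)]
TIERS_2V2: List[Tuple[float, int]] = [(60, 6), (50, 4), (40, 2)]
EXTRA_1V1 = (95, 2)  # (threshold in percent, extra-credit marks)
EXTRA_2V2 = (75, 2)


def marks(pct: float, tiers: List[Tuple[float, int]],
          extra: Tuple[float, int]) -> int:
    """Marks for a goal scoring percentage `pct` (0-100)."""
    base = 0
    for threshold, value in tiers:
        if pct > threshold:  # "exceeds" is treated as strictly greater
            base = value
            break
    extra_threshold, extra_value = extra
    bonus = extra_value if pct > extra_threshold else 0
    return base + bonus
```

For instance, `marks(62, TIERS_1V1, EXTRA_1V1)` gives 3, while `marks(62, TIERS_2V2, EXTRA_2V2)` gives 6.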