Advanced Features¶

This page is a loose collection of various more advanced features of the problem creation process.

Submodels¶

So far we've just used tuples to group multiple pieces of data together. For example, we defined an edge as just a tuple of two vertices. This works great for very simple types when it's clear what each element of the tuple means, but can become very confusing quickly. Let's say we want to define a problem where rectangles are placed in a 2D coordinate system. These are then defined by four integers: width, height, and x and y position. We could now define the instances like this

class Instance(InstanceModel):

    rectangles: list[tuple[int, int, int, int]]

but we, and more importantly our students, will then have to always remember the order we put those numbers in. To prevent bugs caused by this we can also define a helper class that inherits from BaseModel in algobattle.util. This will then not have the instance or solution specific stuff added, but will also allow us to create json validation specifications just like in those classes.

from algobattle.util import BaseModel


class Rectangle(BaseModel):

    x: int
    y: int
    width: int
    height: int


class Instance(InstanceModel):

    rectangles: list[Rectangle]

Warning

The Pydantic package also exports a class called BaseModel which offers similar functionality. Always inherit from the class Algobattle provides since it includes additional settings that are important for everything to function correctly.

Pydantic then expects a json object at the places where you use these types with keys and values matching the attributes found in the class. For example, a valid instance json file for the above class can look like this:

instance.json

{
    "rectanlges": [
        {
            "x": 3,
            "y": 2,
            "width": 5,
            "height": 2
        },
        {
            "x": 0,
            "height": 17
            "width": 5,
            "y": -2,
        }
    ]
}

External Dependencies¶

You may want to import some additional python libraries to use in your problem file, such as networkx when implementing a more complicated graph problem. To do this we not only have to add the import statement in the problen.py file, but also include it in the list of dependencies in the project config.

algobattle.toml

[match]
problem = "Some Graph Problem"

[problem]
dependencies = [
    "networkx",
]

When initializing the project Algobattle will then make sure that the needed libraries are installed on the user's machine.

Hiding Parts of the Instance¶

Sometimes we want the generator to give us some data we need to verify the instance, but don't want the solver to see this. For example, consider the problem of finding the longest path in a graph with a vertex cover (1) of a specific size. The best way to verify that the instance graph actually contains such a vertex cover is to have it be part of the instance data. But we want the solver to only have access to the information that it exists, not which exact vertices are in it.

A vertex cover is a subset of vertices such that all other vertices in the graph have an edge to a vertex in it.

We can use this using the Field construct from Pydantic. It is a simple marker object that lets us configure various properties of specific fields. One of these settings is the exclude key, which tells Pydantic to exclude this field when serializing the object into json. It will still be parsed and validated normally when reading json data and creating the Python object. We can use it either as the default value of the attribute, or as Annotated[...] metadata.

Example

In this class

from pydantic import Field

class Instance(InstanceModel):
    """An instance of the Some Example problem."""

    normal_attribute: int
    hidden: float = Field(exclude=True)
    also_hidden: Annotated[str, Field(exclude=True)]

the first attribute normal_attribute will be the only that is included in the output that the solver sees. All three attributes are required to be in the instance data the generator creates and will be available on the Python object.

Tip

The Field specifier lets you do many more things than this! Read the excellently written Pydantic documentation to see more use cases.

Using the Algobattle Graph Classes¶

Since many problems use graphs as the foundation of their instances we provide several utility classes to make working with these easier. These classes are DirectedGraph, UndirectedGraph, EdgeWeights, and VertexWeights. Using these classes also ensures that multiple graph problems use the same formats and students won't have to worry about changing any boilerplate code.

The easiest use case is to just inherit from DirectedGraph or UndirectedGraph instead of InstanceModel. Your instance class will then behave the same as if it also included the num_vertices and edges keys which hold a single number specifying the number of vertices in the graph (numbered 0 through n-1) and a list of tuples of such vertices respectively. They will also ensure proper validation, with DirectedGraph accepting any graph and UndirectedGraph accepting only those that contain no self loops and where edges are interpreted as being directionless. Both graph's size is the number of vertices in it.

Reachability

An implementation of the Reachability (1) problem's instance class might look something like this:

Given a graph and two vertices in it, is there a path between them?

class Instance(DirectedGraph):
    """An instance of a Reachability problem."""

    start: Vertex
    end: Vertex

Which is equivalent to

class Instance(InstanceModel):
    """An instance of a Reachability problem."""

    num_vertices: u64
    edges: Annotated[list[tuple[SizeIndex, SizeIndex]], UniqueItems]

    start: Vertex
    end: Vertex

    @property
    def size(self) -> int:
        """A graph's size is the number of vertices in it."""
        return self.num_vertices

Associated Annotation Types

As you can see in the example above, we also provide several types that are useful in type annotations of graph problems such as Vertex, Edge, or Path. How these function is explained in more detail in the advanced annotations section.

If you want the problem instance to also contain additional information associated with each vertex and/or each edge you can use the VertexWeights and EdgeWeights mix ins. These are added as an additional parent class and must be indexed with the type of the weights you want to use.

Labelled Vertices and Weighted Edges

Say your problem wants to label each vertex with the name of a city and each edge with the distance it represents. This would be done like this:

class Instance(DirectedGraph, VertexWeights[str], EdgeWeights[float]):

    ...

Both are encoded as lists of the weights where the nth entry corresponds to the weight of the nth vertex or edge. I.e. the above is equivalent to

class Instance(DirectedGraph):

    vertex_weights: Annotated[list[str], SizeLen]
    edge_weights: Annotated[list[float], EdgeLen]

    ...

Tip

These classes also contain some utility methods to easily perform common graph operations. For example, UndirectedGraph.edge_set contains all edges in both directions, and the neighbors methods lets you quickly access a vertex's neighbours.

Comparing Floats¶

Abstract

This is a somewhat esoteric section that is not strictly needed to use Algobattle. If you're interested in the details this is perfect for you, but the important takeaway for everyone is that we recommend everyone to use the LaxComp class or lax_comp function when working with floats.

We use floats to represent real numbers, but these are limited to a certain precision (64 bits). This can lead to annoying corner cases, finicky bugs, and teams being encouraged to put more energy into running into these than actually solving the problem. For example, the equality 0.1 + 0.1 + 0.1 == 0.3 does not actually hold when using floats. If a problem instance makes use of floats students can then use these inaccuracies to specifically craft instances that can only be solved when carefully keeping track of your exact arithmetical operations and the inaccuracies they introduce. Most of the time this is not actually what we want the focus of a problem to be on, so we'd rather students just ignore these corner cases when working with floats and treat them as close to actual real numbers as possible.

Normally we do this by never comparing strict equality between float values and instead just checking if they are close "enough" for our use case. This is not enough for us since teams would then just create corner case instances that rely on the exact error bound we use. The solution then is to allow the solving team to introduce bigger errors than the generating team. That means that the generating team cannot create instances right at the edge of precision errors since the solver's solutions will be validated with bigger allowable errors.

This sounds complicated, but we've already implemented the hard bits in a utility class and function for you. All you need to keep in mind is that whenever you would be doing a comparison involving equality (==, <=, or >=) to instead use LaxComp or lax_comp from the algobattle.types module. The first acts as a wrapper that will then safely perform these comparisons and the second performs the comparison immediately.

Example

Here's several usage examples assuming the role variable contains the role of the team we're currently validating something for:

LaxComp(0.1 + 0.1 + 0.1, role) == 0.3
LaxComp(0.2, role) <= 0.3
lax_comp(0.1 + 0.1 + 0.1, "==", 0.3 role)
lax_comp(0.300001, ">=", 0.3 role)

The margins for error these introduce are absolutely tiny for all normal applications, about 10¹⁴ times smaller than the values that are being compared, so they can be safely ignored by the application logic. But they are big enough to cover any normal errors introduces by float arithmetic and thus make it safe to naively use them as though they represent actual real numbers.

Problems Without Witnesses¶

Most problems require the generator to not only create an instance, but also provide a solution for it. But we can also create problems that expect only the instance in the generator's output. To do this, just set the corresponding argument in the Problem constructor to False.

Problem(
    name="Instance Only Problem",
    min_size=17,
    instance_cls=Instance,
    solution_cls=Solution,
    with_solution=False,
)

Custom Scoring Functions¶

In a match each fight (one execution of the generator followed by the other team's solver) is assigned a score. This is a number in [0, 1] that indicates how well the solver did. Normally it is computed by just dividing the solver's solution score by the generator's (or vice versa, if the goal is to minimize the score). This matches the usual notion of approximation ratios.

Example

When using the Independent Set (1) problem the generator created a solution with an independent set of size 5. The solver found one of size 4. Since the goal here is to maximize the solution size the score of this fight would be 0.8. But for Vertex Cover (2) the objective is to find the smallest vertex cover, so if the solver found one of size 20 and the generator of size 17, the overall score would be roughly 1.18.

Given a graph, find the biggest subset of its vertices that have no edge between any pair of them.
Given a graph, find the smalles subset of vertices such that every other vertex has an edge to one in that set.

But this isn't always what we want to do. Consider the problem of finding the three shortest paths between two given vertices in a graph, formalized like this:

class Instance(DirectedGraph):
    """Instances of the 3 Shortest Paths problem."""

    start: Vertex
    end: Vertex


class Solution(SolutionModel):
    """Solutions of the 3 Shortest Paths problem."""

    paths: tuple[Path, Path, Path] # (2)!

    @minimize
    def score(self, role: Role) -> float:
        ...

Problem(
    name="3 Shortest Paths",
    min_size=5,
    instance_cls=Instance,
    solution_cls=Solution,
)

For clarity, we omit the imports here.
You'd need additional validation logic here, but we now want to focus on just the score function.

How do we best implement the score function? We could just add the lengths of all the paths, or just pick the length of the shortest or longest one. But really, what we want is for the final score to not just compare a single number for each solution, but each path individually.

We can achieve this by providing a custom scoring function to the Problem constructor. This just is a function that directly receives the instance and both solutions and returns a float in [0, 1]. When we do this, we can drop the score method in the Solution class entirely.

class Instance(DirectedGraph):
    """Instances of the 3 Shortest Paths problem."""

    start: Vertex
    end: Vertex


class Solution(SolutionModel):
    """Solutions of the 3 Shortest Paths problem."""

    paths: tuple[Path, Path, Path]


def compare_each_path(
    instance: Instance,
    generator_solution: Solution,
    solver_solution: Solution
) -> float:
    gen_lens = sorted(len(path) for path in generator_solution.paths) # (1)!
    sol_lens = sorted(len(path) for path in solver_solution.paths)
    ratios = [len(gen) / len(sol) for gen, sol in zip(gen_lens, sol_lens)] # (2)!
    ratios = [min(1, max(0, num)) for num in ratios] # (3)!
    return sum(ratios) / 3 # (4)!

Problem(
    name="3 Shortest Paths",
    min_size=5,
    instance_cls=Instance,
    solution_cls=Solution,
    score_function=compare_each_path,
)

Get each path's length and sort them.
Compute the ratio of each corresponding pair of paths.
Clamp the ratios to be in [0, 1].
Return the average of the ratios.

Test Instances¶

When running algobattle test the CLI tool first tries to build and run the generator and then the solver. But in order to be able to test run the solver, we need to provide it with an input instance. This means that if your generator does not run successfully you also cannot test your solver. To prevent this issue, we can provide a simple test instance when defining the problem. It will then be passed to the solver instead.

Problem(
    name="Pairsum",
    min_size=4,
    instance_cls=Instance,
    solution_cls=Solution,
    test_instance=Instance(numbers=[1, 2, 3, 4]),
)

Make sure to pass a valid instance

This instance will not undergo the usual validation step and does not come with a solution. This means you can accidentally provide a test instance which can't actually be solved correctly.