Projects

The goal of the project should be to create something, which is actually useful. Therefore we offer a lot of freedom in how the project will look like with the condition that you should spent around 60 hours on it (this number was derived as follows: each credit is worth 30 hours minus 13 lectures + labs minus 10 homeworks 2 hours each) and you should demonstrate some skills in solving the project. In general, we can distinguish three types of project depending on the beneficiary:

You benefit: Use / try to solve a well known problem using Julia language,
Our group: work with your tutors on a topic researched in the AIC group,
Julia community: choose an issue in a registered Julia project you like and fix it (documentation issues are possible but the resulting documentation should be very nice.).

The project should be of sufficient complexity that verify your skill of the language (to be agreed individually).

Below, we list some potential projects for inspiration.

Implementing new things

Lenia (Continuous Game of Life)

Lenia is a continuous version of Conway's Game of Life. Try to implement a Julia version and try to make it faster than python version. You can also try to build nice visualizations with Makie.jl.

Nice tutorial from Conway to Lenia

Game of Life

Implement differentiable Conway's Game of Life. While the problem is in its essence discrete, there exist (a biased) approximation of the gradient explained here. Try to port it from Python to Julia with a nice visualization. Try to make it fast, potentially utilizing GPU.

Architecture visualizer

Create an extension of Flux / Lux and to visualize architecture of a neural network suitable for publication. Something akin PlotNeuralNet.

Learning Large Language Models with reduced precition (Mentor: Tomas Pevny)

Large Language Models ((Chat) GPT, LLama, Falcon, Palm, ...) are huge. A recent trend is to perform optimization in reduced precision, for example in int8 instead of Float32. Such feature is currently missing in Julia ecosystem and this project should be about bringing this to the community (for an introduction, read these blogs LLM-int8 and emergent features, A gentle introduction to 8-bit Matrix Multiplication). The goal would be to implement this as an additional type of Number / Matrix and overload multiplication on CPU (and ideally on GPU) to make it transparent for neural networks? What I will learn? In this project, you will learn a lot about the (simplicity of) implementation of deep learning libraries and you will practice abstraction of Julia's types. You can furthermore learn about GPU Kernel programming and Transformers.jl library.

Planning algorithms (Mentor: Tomas Pevny)

Extend SymbolicPlanners.jl with the mm-ϵ variant of the bi-directional search MM: A bidirectional search algorithm that is guaranteed to meet in the middle. This pull request might be very helpful in understanding better the library.

A Rule Learning Algorithms (Mentor: Tomas Pevny)

Rule-based models are simple and very interpretable models that have been around for a long time and are gaining popularity again. The goal of this project is to implement one of these algorithms

sequential covering

algorithm called RIPPER and evaluate it on a number of datasets.

To increase the impact of the project, consider interfacing it with MLJ.jl

Parallel optimization (Mentor: Tomas Pevny)

Implement one of the following algorithms to train neural networks in parallel. Can be implemented in a separate package or consider extending FluxDistributed.jl. Do not forget to verify that the method actually works!!!

Solve issues in existing projects:

Address issues in HMil/JsonGrinder library (Mentor: Simon Mandlik)

These are open source toolboxes originally develuped at CTU and then made production quality in Avast Lots of general functionality is done, but there is always space for love and polishing. Look at issues at Mill.jl or JsonGrinder.jl and solve some of them.

Study porting adapting Mill.jl to GPU (Mentor: Simon Mandlik)

Mill.jl is a library which is designed to do machine learning over tree-structured data stored in formats like JSON, ProtoBuffer, etc. Try to port the library to run on GPU. The interesting part is schedulling, because the lowest parts can be executed in parallel. Study, if GPU scheduller will reorder operations in order this parallelization happens even though CPU code executes in depth-first order. If not, propose how to change the code.

Project requirements

The goal of the semestral project is to create a Julia pkg with reusable, properly tested and documented code. We have given you some options of topics, as well as the freedom to choose something that could be useful for your research or other subjects. In general we are looking for something where performance may be crucial such as data processing, optimization or equation solving.

In practice the project should roughly follow the structure below:

.
├── scripts
│	├── run_example.jl			# one or more examples showing the capabilities of the pkg
│	├── Project.toml 			# YOUR_PROJECT should be added here with develop command with rel path
│	└── Manifest.toml 			# should be committed as it allows to reconstruct the environment exactly
├── src
│	├── YOUR_PROJECT.jl 		# ideally only some top level code such as imports and exports, rest of the code included from other files
│	├── src1.jl 				# source files structured in some logical chunks
│	└── src2.jl
├── test
│	├── runtest.jl              # contains either all the tests or just includes them from other files
│	├── Project.toml  			# lists some additional test dependencies
│	└── Manifest.toml   		# usually not committed to git as it is generated on the fly
├── docs
│   ├── Project.toml
│   ├── make.jl
│   └── src
│       └── index.md
├── README.md 					# describes in short what the pkg does and how to install pkg (e.g. some external deps) and run the example
├── Project.toml  				# lists all the pkg dependencies
└── Manifest.toml  				# usually not committed to git as the requirements may be to restrictive

Make sure that

README.md is present and contains general information about the package. A small example is a nice to have.
The package can be installed trough the package manager as Pkg.add("url of the package") with all and correct dependencies. Do not register the package into an official registry if you are not willing to continue its development and maintainance.
Make sure that the package is covered by tests which are in the test folder. We will try to run them. There is no need for 100% percent test coverage. Tests testing the functionality are sufficient.
The package should have basic documentation. For small packages, it is sufficient to have documentation in readme. For larger pacakges, proper documentation with Documenter.jl is advised.

Only after all this we may look at the extent of the project and it's difficulty, which may help us in deciding between grades.

Nice to have things, which are not strictly required but obviously improves the score.

Ideally the project should be hosted on GitHub, which could have the continuous integration/testing set up.
Include some benchmark and profiling code in your examples, which can show us how well you have dealt with the question of performance.
Some parallelization attempts either by multi-processing, multi-threading, or CUDA. Do not forget to show the improvement.
Documentation with a webpage using Documenter.jl.

Former projects for your inspiration

The following is a list of great projects of past years.

NeuralCollaborativeFiltering.jl
OptimalTrainControl.jl
Urban Traffic Control
Directed Evolution in Silico
ImageInspector.jl (used in our bachelor course)