Oct 29, 2024 Our arXiv preprint proposes a new neural network layer, the Fourier head, which learns a continuous probability density function using Fourier series, and returns a discrete approximation of it. When to use it? Large language models are often adapted to model non-linguistic tokens. If these tokens have an underlying continuous structure, then replacing the linear classification head with the Fourier head can boost downstream performance. Project page
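As a rough illustration of the idea (a minimal sketch, not the preprint's implementation), the snippet below turns a handful of Fourier coefficients into a categorical distribution over evenly spaced bins on [-1, 1]. The function name `fourier_bin_probs`, the squared-modulus parameterization used to keep the density nonnegative, and the choice of bin centers are all assumptions made for this example. In an actual layer the coefficients would presumably come from a learned map of the input rather than being hard-coded.

```python
import cmath
import math


def fourier_bin_probs(coeffs, num_bins):
    """Discretize a Fourier-series density on [-1, 1] into bin probabilities.

    coeffs: complex coefficients a_0..a_K (hypothetical parameterization).
    The density is taken proportional to |sum_k a_k * exp(i*pi*k*x)|^2,
    which is nonnegative by construction (an assumption of this sketch).
    """
    # Centers of num_bins equal-width bins covering [-1, 1].
    centers = [-1 + (2 * j + 1) / num_bins for j in range(num_bins)]

    unnormalized = []
    for x in centers:
        series = sum(a * cmath.exp(1j * math.pi * k * x)
                     for k, a in enumerate(coeffs))
        unnormalized.append(abs(series) ** 2)

    # Normalize so the bin values form a probability distribution.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Example with three arbitrary coefficients and eight bins.
probs = fourier_bin_probs([1.0, 0.5 + 0.2j, 0.1], num_bins=8)
```

Because nearby bins are evaluated from the same smooth function, their probabilities vary continuously across the range, which is the property a plain linear classification head does not enforce.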
Oct 28, 2024 I will be attending AGNES at Dartmouth!