Large Language Models (LLMs) are often viewed through the lens of deep learning architectures and large-scale computation. This article argues that, at their core, LLMs are statistical models grounded in probability, likelihood-based estimation, and information theory. We present a unified statistical interpretation of language modelling as conditional density estimation over discrete sequences, which attempts to predict the next word, given its context. We trace the evolution of this paradigm from classical n-gram models, rooted in Markov probability structures, to modern transformer-based architectures that go beyond fixed-order dependence.
Key components of LLMs, including attention, contextual embeddings, and decoding strategies, are interpreted using familiar statistical concepts such as adaptive weighting, sufficient representations, and stochastic sampling. Training and fine-tuning procedures are analysed within the framework of statistical learning theory, highlighting connections to cross-entropy minimisation, regularisation, generalisation, and information-theoretic trade-offs. By reframing LLMs as large-scale statistical estimators rather than symbolic reasoning systems, the paper seeks to connect modern language models with classical statistical ideas and to motivate further theoretical work at their intersection.
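As a small illustration of the statistical view described above, the following sketch fits a first-order Markov (bigram) model by maximum likelihood and evaluates the average cross-entropy of a token sequence under it. It is not taken from the article: the function names `fit_bigram` and `cross_entropy` and the toy corpus are assumptions made purely for illustration, and no smoothing is applied.

```python
from collections import Counter, defaultdict
import math


def fit_bigram(tokens):
    """Maximum-likelihood estimates of P(next | prev) from a list of tokens."""
    pair_counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        pair_counts[prev][nxt] += 1
    # Normalise the counts for each context into a conditional distribution.
    return {
        prev: {w: c / sum(counter.values()) for w, c in counter.items()}
        for prev, counter in pair_counts.items()
    }


def cross_entropy(model, tokens):
    """Average negative log-likelihood (nats per predicted token).

    Raises KeyError for a bigram unseen in training, since this
    sketch has no smoothing.
    """
    pairs = list(zip(tokens, tokens[1:]))
    return -sum(math.log(model[p][n]) for p, n in pairs) / len(pairs)


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat the cat ran".split()
model = fit_bigram(corpus)
loss = cross_entropy(model, corpus)
```

Minimising this cross-entropy over the conditional probabilities is the same as maximising the likelihood, which is the sense in which next-word prediction can be read as conditional density estimation. A transformer replaces the fixed one-token context with a learned function of the whole preceding sequence.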
Topological semantics for a non-self-extensional LFI
In this talk, a new Logic of Formal Inconsistency (LFI), which we call vD (Vague Disjunction), will be introduced. The name reflects the fact that the disjunction in vD is only partially determined, unlike in other LFIs. This logic is non-self-extensional, i.e., the replacement property, or the rule for substitution of equivalents, does not hold. A Hilbert-style presentation of the logic will be given. Then, a topological semantics for vD will be described, with respect to which the logic is sound and complete.
Copula-based information measures and dependence in bivariate distributions
We propose extropy measures based on the density copula, distributional copula, and survival copula, and explore their properties. We establish connections between cumulative copula extropy and three dependence measures: Spearman’s rho, Kendall’s tau, and Blest’s measure of rank correlation. We also introduce a semiparametric estimation procedure for the cumulative copula extropy and illustrate it with a Monte Carlo simulation study.
A k-coloring of a graph is an assignment of k colors to its vertices such that no two adjacent vertices receive the same color. The COLORING PROBLEM is the problem of determining the smallest k such that the graph admits a k-coloring. Given a set L of graphs, a graph G is L-free if G does not contain any graph in L as an induced subgraph. The complexity of the COLORING PROBLEM for L-free graphs is known (NP-complete or polynomial-time solvable) whenever L contains a single graph. There has been keen interest in coloring graphs whose forbidden list L contains basic graphs such as induced paths, induced cycles and their complements. In this talk, I will provide a survey of recent progress on this topic.
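The definitions above can be made concrete with a short sketch that checks whether an assignment is a proper coloring and finds the smallest k by exhaustive search. This is only an illustration for tiny graphs, not a method from the talk: the search is exponential in the number of vertices, and the names `is_proper_coloring`, `chromatic_number` and `c5_edges` are hypothetical.

```python
from itertools import product


def is_proper_coloring(edges, coloring):
    """True if no edge joins two vertices with the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def chromatic_number(vertices, edges):
    """Smallest k such that the graph admits a k-coloring.

    Brute force over all assignments; only practical for very small graphs.
    """
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            if is_proper_coloring(edges, dict(zip(vertices, colors))):
                return k
    return 0  # graph with no vertices


# The induced cycle on five vertices, one of the "basic graphs" mentioned above.
c5_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k_c5 = chromatic_number(range(5), c5_edges)
```

An odd cycle such as this one cannot be colored with two colors, so the search only succeeds once k reaches 3.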
A total colouring of a graph is an assignment of colours to the vertices and edges of the graph that satisfies the following properties: (i) any two adjacent vertices have different colours, (ii) any two edges incident on a common vertex have different colours, and (iii) an edge has a colour different from those of its two endpoints.
Thus, restricting a total colouring to the vertex set of the graph gives a proper vertex colouring of the graph, and restricting it to the edge set of the graph gives a proper edge colouring of the graph. The famous Total Colouring Conjecture, first stated in the 1960s independently by Behzad and Vizing, asserts that every graph having maximum degree Δ has a total colouring using at most Δ+2 colours. Given that this conjecture has resisted attempts to solve it for so many years, it makes sense to study weaker versions of it. The Weak Total Colouring Conjecture relaxes the bound required on the number of colours by 1, that is, it states that every graph having maximum degree Δ has a total colouring using at most Δ+3 colours. The Weak Total Colouring Conjecture is known to be true for 4-colourable graphs (a graph is k-colourable if the vertices can be given colours from a set of k colours in such a way that no two adjacent vertices get the same colour). We show how to extend this proof to make it work for 5-colourable graphs.
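The three conditions defining a total colouring can be checked mechanically. The sketch below is a minimal validator, assuming a graph given as a list of edges with separate dictionaries for vertex and edge colours. The function name `is_total_colouring` and the triangle example are illustrative assumptions and are not part of the proof discussed in the talk.

```python
from itertools import combinations


def is_total_colouring(edges, vertex_colour, edge_colour):
    """Check conditions (i)-(iii) of a total colouring.

    edges: list of (u, v) tuples; edge_colour is keyed by the same tuples.
    """
    # (i) adjacent vertices have different colours
    for u, v in edges:
        if vertex_colour[u] == vertex_colour[v]:
            return False
    # (ii) edges incident on a common vertex have different colours
    for e, f in combinations(edges, 2):
        if set(e) & set(f) and edge_colour[e] == edge_colour[f]:
            return False
    # (iii) each edge differs in colour from both of its endpoints
    for u, v in edges:
        if edge_colour[(u, v)] in (vertex_colour[u], vertex_colour[v]):
            return False
    return True


# Triangle (maximum degree 2): each edge takes the colour of the opposite vertex.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
vertex_colour = {"a": 0, "b": 1, "c": 2}
edge_colour = {("a", "b"): 2, ("b", "c"): 0, ("a", "c"): 1}
ok = is_total_colouring(triangle, vertex_colour, edge_colour)
```

This example uses three colours on a graph of maximum degree 2, which is within the Δ+2 bound of the Total Colouring Conjecture.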