Daily Papers

by AK and the research community

Jan 7

CURVALID: Geometrically-guided Adversarial Prompt Detection

Adversarial prompts capable of jailbreaking large language models (LLMs) and inducing undesirable behaviours pose a significant obstacle to their safe deployment. Current mitigation strategies rely on activating built-in defence mechanisms or fine-tuning the LLMs, but the fundamental distinctions between adversarial and benign prompts are yet to be understood. In this work, we introduce CurvaLID, a novel defense framework that efficiently detects adversarial prompts by leveraging their geometric properties. It is agnostic to the type of LLM, offering a unified detection framework across diverse adversarial prompts and LLM architectures. CurvaLID builds on the geometric analysis of text prompts to uncover their underlying differences. We theoretically extend the concept of curvature via the Whewell equation into an n-dimensional word embedding space, enabling us to quantify local geometric properties, including semantic shifts and curvature in the underlying manifolds. Additionally, we employ Local Intrinsic Dimensionality (LID) to capture geometric features of text prompts within adversarial subspaces. Our findings reveal that adversarial prompts differ fundamentally from benign prompts in terms of their geometric characteristics. Our results demonstrate that CurvaLID delivers superior detection and rejection of adversarial queries, paving the way for safer LLM deployment. The source code can be found at https://github.com/Cancanxxx/CurvaLID

4 authors · Mar 5, 2025
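
The abstract above names Local Intrinsic Dimensionality (LID) as one of the geometric features it uses. As a rough illustration only, the sketch below implements the widely used maximum-likelihood LID estimator over k nearest-neighbour distances. It is not taken from the CurvaLID source code: the function name `lid_mle`, the choice of Euclidean distance, and the default `k` are all assumptions made for this example.

```python
import math


def lid_mle(query, reference, k=20):
    """Maximum-likelihood LID estimate of `query` relative to `reference`.

    Hypothetical helper for illustration; not the CurvaLID implementation.
    `query` is a sequence of floats, `reference` a list of such sequences.
    """
    # Euclidean distances from the query to every reference point, ascending.
    dists = sorted(math.dist(query, p) for p in reference)
    # Discard zero distances (e.g. the query itself) and keep the k nearest.
    dists = [d for d in dists if d > 0.0][:k]
    if len(dists) < 2:
        raise ValueError("need at least two neighbours at non-zero distance")
    r_max = dists[-1]
    # MLE form: LID = -1 / mean(log(r_i / r_max)) over the kept neighbours.
    mean_log = sum(math.log(d / r_max) for d in dists) / len(dists)
    if mean_log == 0.0:
        raise ValueError("all neighbours are equidistant; estimate undefined")
    return -1.0 / mean_log
```

Called on points sampled from a flat two-dimensional sheet embedded in a higher-dimensional space, the estimate should land near 2 regardless of the ambient dimension, which is the property that makes LID useful as a local geometric feature.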

Higgs-Induced Gravitational Waves: the Interplay of Non-Minimal Couplings, Kination and Top Quark Mass

We explore a minimal scenario where the sole Standard-Model Higgs is responsible for reheating the Universe after inflation, produces a significant background of gravitational waves and maintains the full classical stability of the electroweak vacuum. As the Higgs self-coupling runs toward negative values at high energy scales, a non-minimal interaction with curvature during a stiff background expansion era drives the Higgs fluctuations closer to the instability scale. This curvature-induced tachyonic instability leads to an intense production of Higgs particles, accompanied by a stochastic gravitational-wave background. The characteristic features of such a signal can be directly correlated to the inflationary scale, the non-minimal coupling parameter and the top quark Yukawa coupling. We distinguish between three possible scenarios: absolute stability with low top quark masses, potential vacuum instability, and absolute stability with new physics above the instability scale. Our findings suggest that the detection of a peaked background of gravitational waves together with its inflationary tail has the potential to unveil the features of the Higgs effective potential at very high energy scales while providing a minimal explanation for the reheating phase and the emergence of the Standard-Model plasma in the early Universe. Unlike other studies in the literature, the generation of gravitational waves in our scenario does not depend on the quantum instability of the Standard Model vacuum.

2 authors · Feb 6, 2025

More on the Weak Gravity Conjecture via Convexity of Charged Operators

The Weak Gravity Conjecture has recently been re-formulated in terms of a particle with non-negative self-binding energy. Because of the dual conformal field theory (CFT) formulation in anti-de Sitter space, the conformal dimension Δ(Q) of the lowest-dimension operator with charge Q under some global U(1) symmetry must be a convex function of Q. This property has been conjectured to hold for any (unitary) conformal field theory and generalized to larger global symmetry groups. Here we refine and further test the convex charge conjecture via semiclassical computations for fixed charge sectors of different theories in different dimensions. We analyze the convexity properties of the leading and next-to-leading order terms stemming from the semiclassical computation, de facto extending previous tests beyond the leading perturbative contributions and to arbitrary charges. In particular, the leading contribution is sufficient to test convexity in the semiclassical computations. We also consider intriguing cases in which the models feature a transition from real to complex conformal dimensions either as a function of the charge or number of matter fields. As a relevant example of the first kind, we investigate the O(N) model in 4+ε dimensions. As an example of the second type we consider the U(N)×U(M) model in 4−ε dimensions. Both models display rich dynamics where, by changing the number of matter fields and/or charge, one can achieve dramatically different physical regimes. We discover that whenever a complex conformal dimension appears, the real part satisfies the convexity property.

5 authors · Sep 10, 2021
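
For readers unfamiliar with the convexity property mentioned in the abstract above, it is commonly written in the form below. The symbols $q_0$ (a fixed charge of order one) and the positive integers $n_1, n_2$ are notation assumed here for illustration; the abstract itself does not spell out the inequality.

```latex
% Convex charge condition on the lowest operator dimension \Delta(Q):
% the dimension at total charge (n_1 + n_2) q_0 is at least the sum of
% the dimensions at the two constituent charges.
\[
  \Delta\big((n_1 + n_2)\, q_0\big) \;\ge\; \Delta(n_1 q_0) + \Delta(n_2 q_0)
\]
```

Read in the dual gravitational picture, the inequality says that combining two charged states does not lower the total energy, which matches the non-negative self-binding energy stated at the start of the abstract.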