Palisade Research (Collaborators)

Team
non-profit
Verified
https://palisaderesearch.org
palisadeai
PalisadeResearch
Activity Feed

Team members: Oleg Serikov, Adam Gleave, Nathan, Chris Cundy, justin rokisky, John Mein, Abigail Haddad, Jeffrey Ladish, Ian McKenzie, Siddharth Narayanan, Dmitrii, Cash Costello, Fedor Ryzhenkov, Reworr, Ben Weinstein-Raun, KB, Dmitrii (test), Artem, Akash Wasil, khaidar bikmaev, bsrand, Viktor Petukhov, Rustem Turtayev, Nikolaj, Linda Petrini, Sergey Koldyba, John Steidley

lindapetrini 
authored a paper 5 months ago

Inverse Scaling in Test-Time Compute

Paper • 2507.14417 • Published Jul 19 • 27
Reworr-R 
updated a dataset 10 months ago

palisaderesearch/LLM-Honeypot-Logs

Updated about 2 hours ago • 45 • 5
dmitrii-palisaderesearch 
authored a paper about 1 year ago

Badllama 3: removing safety finetuning from Llama 3 in minutes

Paper • 2407.01376 • Published Jul 1, 2024
AdamGleave 
authored a paper almost 2 years ago

Exploiting Novel GPT-4 APIs

Paper • 2312.14302 • Published Dec 21, 2023 • 14
jladish 
authored a paper about 2 years ago

LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B

Paper • 2310.20624 • Published Oct 31, 2023 • 13
oserikov 
authored a paper about 2 years ago

The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute

Paper • 2309.11197 • Published Sep 20, 2023 • 5
AdamGleave 
authored 2 papers over 2 years ago

Adversarial Policies Beat Superhuman Go AIs

Paper • 2211.00241 • Published Nov 1, 2022

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

Paper • 2203.07475 • Published Mar 14, 2022