Stanislav Fort
Google DeepMind / Stability AI / Anthropic / Stanford University / Google Brain
Verified email at stanford.edu - Homepage
Title
Cited by
Year
Deep Ensembles: A Loss Landscape Perspective
S Fort, H Hu, B Lakshminarayanan
arXiv preprint arXiv:1912.02757, 2019
Cited by: 543, Year: 2019
Training a helpful and harmless assistant with reinforcement learning from human feedback
Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, S Fort, D Ganguli, T Henighan, et al.
arXiv preprint arXiv:2204.05862, 2022
Cited by: 513*, Year: 2022
Constitutional AI: Harmlessness from AI Feedback
Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...
arXiv preprint arXiv:2212.08073, 2022
Cited by: 432, Year: 2022
Exploring the limits of out-of-distribution detection
S Fort, J Ren, B Lakshminarayanan
Advances in Neural Information Processing Systems 34, 7068-7081, 2021
Cited by: 232, Year: 2021
Training independent subnetworks for robust prediction
M Havasi, R Jenatton, S Fort, JZ Liu, J Snoek, B Lakshminarayanan, ...
arXiv preprint arXiv:2010.06610, 2020
Cited by: 166, Year: 2020
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned
D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...
arXiv preprint arXiv:2209.07858, 2022
Cited by: 152, Year: 2022
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
S Jastrzebski, M Szymczak, S Fort, D Arpit, J Tabor, K Cho, K Geras
arXiv preprint arXiv:2002.09572, 2020
Cited by: 136, Year: 2020
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the neural tangent kernel
S Fort, GK Dziugaite, M Paul, S Kharaghani, DM Roy, S Ganguli
Advances in Neural Information Processing Systems 33, 5850-5861, 2020
Cited by: 131, Year: 2020
Predictability and surprise in large generative models
D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ...
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022
Cited by: 130, Year: 2022
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
J Ren, S Fort, J Liu, AG Roy, S Padhy, B Lakshminarayanan
arXiv preprint arXiv:2106.09022, 2021
Cited by: 125, Year: 2021
Language models (mostly) know what they know
S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...
arXiv preprint arXiv:2207.05221, 2022
Cited by: 101, Year: 2022
Gaussian Prototypical Networks for Few-Shot Learning on Omniglot
S Fort
arXiv preprint arXiv:1708.02735, 2017
Cited by: 92, Year: 2017
Large Scale Structure of Neural Network Loss Landscapes
S Fort, S Jastrzebski
arXiv preprint arXiv:1906.04724, 2019
Cited by: 73, Year: 2019
Stiffness: A new perspective on generalization in neural networks
S Fort, PK Nowak, S Jastrzebski, S Narayanan
arXiv preprint arXiv:1901.09491, 2019
Cited by: 69, Year: 2019
Discovery of gamma-ray pulsations from the transitional redback PSR J1227-4853
TJ Johnson, PS Ray, J Roy, CC Cheung, AK Harding, HJ Pletsch, S Fort, ...
The Astrophysical Journal 806 (1), 91, 2015
Cited by: 58, Year: 2015
Adaptive quantum state tomography with neural networks
Y Quek, S Fort, HK Ng
arXiv preprint arXiv:1812.06693, 2018
Cited by: 56, Year: 2018
The goldilocks zone: Towards better understanding of neural network loss landscapes
S Fort, A Scherlis
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 3574-3581, 2019
Cited by: 41, Year: 2019
Emergent properties of the local geometry of neural loss landscapes
S Fort, S Ganguli
arXiv preprint arXiv:1910.05929, 2019
Cited by: 38, Year: 2019
Measuring progress on scalable oversight for large language models
SR Bowman, J Hyun, E Perez, E Chen, C Pettit, S Heiner, K Lukošiūtė, ...
arXiv preprint arXiv:2211.03540, 2022
Cited by: 33, Year: 2022
Analyzing monotonic linear interpolation in neural network loss landscapes
J Lucas, J Bae, MR Zhang, S Fort, R Zemel, R Grosse
arXiv preprint arXiv:2104.11044, 2021
Cited by: 27*, Year: 2021
Articles 1–20