Nicholas Schiefer

Cited by

	All	Since 2019
Citations	1719	1706
h-index	14	14
i10-index	16	16

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	6	7	11	52	962	661

Public access

5 articles available
0 articles not available
Based on funding mandates

Co-authors

Zac Hatfield Dodds	Anthropic; Australian National University	Verified email at anu.edu.au
Carol Chen	Member of Technical Staff	Verified email at anthropic.com
Jared Kaplan	Johns Hopkins University & Anthropic	Verified email at pha.jhu.edu
Christopher Olah	Anthropic	Verified email at google.com
Robert Lasenby	Stanford University	Verified email at stanford.edu
Dario Amodei	CEO and Co-Founder at Anthropic	Verified email at anthropic.com
Catherine Olsson	Anthropic	Verified email at mit.edu
Dawn Drain	Microsoft	Verified email at microsoft.com
Roger Grosse	Associate Professor, University of Toronto	Verified email at cs.toronto.edu
Erik Winfree	California Institute of Technology	Verified email at caltech.edu
Shyam Narayanan	PhD Student, MIT	Verified email at mit.edu
Piotr Indyk	Professor of Electrical Engineering and Computer Science, MIT	Verified email at mit.edu
Kfir Lev-Ari	Apple	Verified email at alumni.technion.ac.il
Tao Lin	Meta Platforms, Inc.	Verified email at fb.com
Anders Aamand	University of Copenhagen	Verified email at mit.edu
Ronitt Rubinfeld	Professor of Computer Science, MIT and Tel Aviv University	Verified email at csail.mit.edu
Helen Xu	Georgia Institute of Technology	Verified email at gatech.edu
Daniel Jackson	MIT	Verified email at mit.edu
Geoffrey Litt	PhD Student, MIT	Verified email at mit.edu
Alexander Shraer

Nicholas Schiefer

Anthropic

Verified email at mit.edu


Title	Authors	Venue	Cited by	Year
Constitutional AI: Harmlessness from AI feedback	Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...	arXiv preprint arXiv:2212.08073, 2022	577	2022
Language models (mostly) know what they know	S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...	arXiv preprint arXiv:2207.05221, 2022	220	2022
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned	D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...	arXiv preprint arXiv:2209.07858, 2022	210	2022
Toy models of superposition	N Elhage, T Hume, C Olsson, N Schiefer, T Henighan, S Kravec, ...	arXiv preprint arXiv:2209.10652, 2022	139	2022
Discovering language model behaviors with model-written evaluations	E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ...	arXiv preprint arXiv:2212.09251, 2022	123	2022
The capacity for moral self-correction in large language models	D Ganguli, A Askell, N Schiefer, TI Liao, K Lukošiūtė, A Chen, A Goldie, ...	arXiv preprint arXiv:2302.07459, 2023	92	2023
Towards measuring the representation of subjective global opinions in language models	E Durmus, K Nguyen, TI Liao, N Schiefer, A Askell, A Bakhtin, C Chen, ...	arXiv preprint arXiv:2306.16388, 2023	58	2023
Towards monosemanticity: Decomposing language models with dictionary learning	T Bricken, A Templeton, J Batson, B Chen, A Jermyn, T Conerly, N Turner, ...	Transformer Circuits Thread, 2, 2023	49	2023
Measuring progress on scalable oversight for large language models	SR Bowman, J Hyun, E Perez, E Chen, C Pettit, S Heiner, K Lukošiūtė, ...	arXiv preprint arXiv:2211.03540, 2022	41	2022
Towards understanding sycophancy in language models	M Sharma, M Tong, T Korbak, D Duvenaud, A Askell, SR Bowman, ...	arXiv preprint arXiv:2310.13548, 2023	35	2023
Measuring faithfulness in chain-of-thought reasoning	T Lanham, A Chen, A Radhakrishnan, B Steiner, C Denison, ...	arXiv preprint arXiv:2307.13702, 2023	32	2023
Universal Computation and Optimal Construction in the Chemical Reaction Network-Controlled Tile Assembly Model	N Schiefer, E Winfree	21st International Conference on DNA Computing and Molecular Programming …, 2015	26	2015
Question decomposition improves the faithfulness of model-generated reasoning	A Radhakrishnan, K Nguyen, A Chen, C Chen, C Denison, D Hernandez, ...	arXiv preprint arXiv:2307.11768, 2023	25	2023
FoundationDB Record Layer: A Multi-Tenant Structured Datastore	C Chrysafis, B Collins, S Dugas, J Dunkelberger, M Ehsan, S Gray, ...	Proceedings of the 2019 International Conference on Management of Data, 1787 …, 2019	22	2019
Exponentially improving the complexity of simulating the Weisfeiler-Lehman test with graph neural networks	A Aamand, J Chen, P Indyk, S Narayanan, R Rubinfeld, N Schiefer, ...	Advances in Neural Information Processing Systems 35, 27333-27346, 2022	14	2022
Superposition, memorization, and double descent	T Henighan, S Carter, T Hume, N Elhage, R Lasenby, S Fort, N Schiefer, ...	Transformer Circuits Thread, 2023	13	2023
Time Complexity of Computation and Construction in the Chemical Reaction Network-Controlled Tile Assembly Model	N Schiefer, E Winfree	22nd International Conference on DNA Computing and Molecular Programming …, 2016	9	2016
A fill estimation algorithm for sparse matrices and tensors in blocked formats	P Ahrens, H Xu, N Schiefer	2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018	8	2018
Specific versus general principles for constitutional AI	S Kundu, Y Bai, S Kadavath, A Askell, A Callahan, A Chen, A Goldie, ...	arXiv preprint arXiv:2310.13798, 2023	7	2023
Sleeper agents: Training deceptive LLMs that persist through safety training	E Hubinger, C Denison, J Mu, M Lambert, M Tong, M MacDiarmid, ...	arXiv preprint arXiv:2401.05566, 2024	6	2024
