Current text generators, such as ChatGPT, are highly unreliable, difficult to use effectively, unable to do many things we might want them to, and extremely expensive to develop and run. These defects are inherent in their underlying technology. Quite different methods could plausibly remedy all these defects. Would that be good, or bad?
https://betterwithout.ai/better-text-generators
John McCarthy's paper "Programs with common sense": http://www-formal.stanford.edu/jmc/mcc59/mcc59.html
Harry Frankfurt, "On Bullshit": https://www.amazon.com/dp/B001EQ4OJW/?tag=meaningness-20
Petroni et al., "Language Models as Knowledge Bases?": https://aclanthology.org/D19-1250/
Gwern Branwen, "The Scaling Hypothesis": gwern.net/scaling-hypothesis
Rich Sutton's "Bitter Lesson": www.incompleteideas.net/IncIdeas/BitterLesson.html
Guu et al.'s "Retrieval augmented language model pre-training" (REALM): http://proceedings.mlr.press/v119/guu20a/guu20a.pdf
Borgeaud et al.'s "Improving language models by retrieving from trillions of tokens" (RETRO): https://arxiv.org/pdf/2112.04426.pdf
Izacard et al., "Few-shot Learning with Retrieval Augmented Language Models": https://arxiv.org/pdf/2208.03299.pdf
Chirag Shah and Emily M. Bender, "Situating Search": https://dl.acm.org/doi/10.1145/3498366.3505816
David Chapman's original version of the proposal he puts forth in this episode: twitter.com/Meaningness/status/1576195630891819008
Lan et al., "Copy Is All You Need": https://arxiv.org/abs/2307.06962
Mitchell A. Gordon's "RETRO Is Blazingly Fast": https://mitchgordon.me/ml/2022/07/01/retro-is-blazing.html
Min et al.'s "Silo Language Models": https://arxiv.org/pdf/2308.04430.pdf
W. Daniel Hillis, The Connection Machine, 1986: https://www.amazon.com/dp/0262081571/?tag=meaningness-20
Ouyang et al., "Training language models to follow instructions with human feedback": https://arxiv.org/abs/2203.02155
Ronen Eldan and Yuanzhi Li, "TinyStories: How Small Can Language Models Be and Still Speak Coherent English?": https://arxiv.org/pdf/2305.07759.pdf
Li et al., "Textbooks Are All You Need II: phi-1.5 technical report": https://arxiv.org/abs/2309.05463
Henderson et al., "Foundation Models and Fair Use": https://arxiv.org/abs/2303.15715
Authors Guild v. Google: https://en.wikipedia.org/wiki/Authors_Guild%2C_Inc._v._Google%2C_Inc.
Abhishek Nagaraj and Imke Reimers, "Digitization and the Market for Physical Works: Evidence from the Google Books Project": https://www.aeaweb.org/articles?id=10.1257/pol.20210702
You can support the podcast and get episodes a week early, by supporting the Patreon: https://www.patreon.com/m/fluidityaudiobooks If you like the show, consider buying me a coffee: https://www.buymeacoffee.com/mattarnold Original music by Kevin MacLeod. This podcast is under a Creative Commons Attribution Non-Commercial International 4.0 License.
Analysis of image classifiers demonstrates that it is possible to understand backprop networks at the task-relevant run-time algorithmic level. In these systems, at least, networks gain their power from deploying massive parallelism to check for the presence of a vast number of simple, shallow patterns.
https://betterwithout.ai/images-surface-features
This episode has a lot of links:
David Chapman's earliest public mention, in February 2016, of image classifiers probably using color and texture in ways that "cheat": twitter.com/Meaningness/status/698688687341572096
Jordana Cepelewicz's "Where we see shapes, AI sees textures," Quanta Magazine, July 1, 2019: https://www.quantamagazine.org/where-we-see-shapes-ai-sees-textures-20190701/
"Suddenly, a leopard print sofa appears," May 2015: https://web.archive.org/web/20150622084852/http://rocknrollnerd.github.io/ml/2015/05/27/leopard-sofa.html
"Understanding How Image Quality Affects Deep Neural Networks," April 2016: https://arxiv.org/abs/1604.04004
Goodfellow et al., "Explaining and Harnessing Adversarial Examples," December 2014: https://arxiv.org/abs/1412.6572
"Universal adversarial perturbations," October 2016: https://arxiv.org/pdf/1610.08401v1.pdf
"Exploring the Landscape of Spatial Robustness," December 2017: https://arxiv.org/abs/1712.02779
"Overinterpretation reveals image classification model pathologies," NeurIPS 2021: https://proceedings.neurips.cc/paper/2021/file/8217bb4e7fa0541e0f5e04fea764ab91-Paper.pdf
"Approximating CNNs with Bag-of-Local-Features Models Works Surprisingly Well on ImageNet," ICLR 2019: https://openreview.net/forum?id=SkfMWhAqYQ
Baker et al.'s "Deep convolutional networks do not classify based on global object shape," PLOS Computational Biology, 2018: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006613
François Chollet's Twitter threads about AI producing images of horses with extra legs: twitter.com/fchollet/status/1573836241875120128 and twitter.com/fchollet/status/1573843774803161090
"Zoom In: An Introduction to Circuits," 2020: https://distill.pub/2020/circuits/zoom-in/
Geirhos et al., "ImageNet-Trained CNNs Are Biased Towards Texture; Increasing Shape Bias Improves Accuracy and Robustness," ICLR 2019: https://openreview.net/forum?id=Bygh9j09KX
Dehghani et al., "Scaling Vision Transformers to 22 Billion Parameters," 2023: https://arxiv.org/abs/2302.05442
Hasson et al., "Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks," February 2020: https://www.gwern.net/docs/ai/scaling/2020-hasson.pdf