• 17 Posts
  • 87 Comments
Joined 5 years ago
Cake day: June 30th, 2020

  • It really depends on how you quantize the model, and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32B models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you need a much larger context window to feed in large documents etc., then you’d need to go smaller on model size (14B, 27B, etc.), get a multi-GPU setup, or use something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
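
    For intuition, here is a minimal back-of-the-envelope sketch of where the VRAM goes: the standard weights-times-bits estimate plus the usual K/V cache size formula. The specific layer/head/context numbers are illustrative assumptions (loosely modeled on a Qwen2.5-32B-class architecture), not measurements; the linked calculator handles the details more carefully.

    ```python
    # Rough VRAM estimate for a quantized model plus its K/V cache.
    # All architecture numbers below are assumptions for illustration.

    def weights_gb(params_billion: float, bits_per_weight: float) -> float:
        # Weights only: params * bits / 8 -> bytes, reported in GB.
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                    context: int, bytes_per_elem: float) -> float:
        # K and V tensors per layer: 2 * layers * kv_heads * head_dim * context.
        return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / 1e9

    model = weights_gb(32, 4.5)                 # ~4.5 bits/weight for a Q4_K_M-style quant
    cache = kv_cache_gb(64, 8, 128, 16_384, 2)  # fp16 cache at a 16k context (assumed dims)
    print(f"weights ~= {model:.1f} GB, KV cache ~= {cache:.1f} GB")
    # ~18 GB of weights + ~4.3 GB of cache: tight on a 24 GB 3090, which is
    # why quantizing the K/V cache or shrinking the context matters at 32B.
    ```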


  • The US government’s position on this can be summed up as: “massive unaccountable US tech firms having all of your data and manipulating public opinion via their black-box algorithms is fine, but Chinese companies doing the same is a national security concern.” I call BS. The US government is massively overstating the degree to which China is actually a US adversary, because it sees Chinese platforms as a threat to US geopolitical hegemony and to America’s ability to propagandize its own citizens. I have spent some time on RedNote (Xiaohongshu), and all I have seen is friendly cross-cultural exchange and discussion between these supposed ‘adversaries’.