i would swear that in an earlier version of this message the optimal batch size was estimated to be as large as twenty.
yep, original is still visible on mastodon
Yes, it was <20. That was my guess given how inefficient their inference is. People were downvoting me, so I thought maybe the statement was too assertive (to be fair, we don’t know their model size) and relaxed it to <100. That got me more downvotes 🤷‍♂️.