I really can’t answer such a vague question. In general, you should build a fully normalized schema, and then denormalize only IF performance is a problem AND you can prove, by measuring, that denormalizing is what fixes it.
Everything that matters. What kind of data? What exactly? You have all kinds of unqualified terms in there like “very large” and “frequent.” In some cases one way is faster, in some cases another is. Don’t answer with details – I don’t want to know. It doesn’t matter. Start with a normalized schema, populate it with realistic test data, run realistic test queries against it, and measure what happens.
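To make "measure what happens" concrete, here is a minimal sketch of that workflow using Python's built-in `sqlite3` module. Everything in it is an assumption made up for illustration: the `customers`/`orders` tables, the `orders_flat` denormalized candidate, the row counts, and the single reporting query. Substitute your own schema, data volumes, and queries, and run it on the database engine you actually use, since the numbers do not transfer between engines.

```python
import random
import sqlite3
import time


def build_db(n_customers=1000, n_orders=20000, seed=0):
    """Create an in-memory database holding the same data in both layouts."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Normalized layout (hypothetical example schema).
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount INTEGER NOT NULL
        );
        CREATE INDEX idx_orders_customer ON orders(customer_id);
        -- Denormalized candidate: region copied onto every order row.
        CREATE TABLE orders_flat (
            id INTEGER PRIMARY KEY,
            region TEXT NOT NULL,
            amount INTEGER NOT NULL
        );
        """
    )
    regions = ["north", "south", "east", "west"]
    customers = [(i, rng.choice(regions)) for i in range(1, n_customers + 1)]
    region_of = dict(customers)
    orders = [
        (i, rng.randint(1, n_customers), rng.randint(1, 500))
        for i in range(1, n_orders + 1)
    ]
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    conn.executemany(
        "INSERT INTO orders_flat VALUES (?, ?, ?)",
        [(i, region_of[cust], amount) for i, cust, amount in orders],
    )
    conn.commit()
    return conn


NORMALIZED_QUERY = """
    SELECT c.region, SUM(o.amount)
    FROM orders AS o JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.region ORDER BY c.region
"""
FLAT_QUERY = """
    SELECT region, SUM(amount) FROM orders_flat
    GROUP BY region ORDER BY region
"""


def time_query(conn, sql, repeats=5):
    """Run the query several times; return (best elapsed seconds, rows)."""
    best = float("inf")
    rows = None
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best, rows


if __name__ == "__main__":
    conn = build_db()
    for label, sql in (("normalized", NORMALIZED_QUERY), ("flat", FLAT_QUERY)):
        seconds, _ = time_query(conn, sql)
        print(f"{label}: {seconds * 1000:.2f} ms")
```

Both queries return the same rows, so the only thing being compared is the cost of the layout. If the normalized version is fast enough at realistic volumes, you are done and there is nothing to denormalize.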