I've been looking around for concrete, specific, and practical answers to the question "how many inputs can a (N)AND/(N)OR gate have?" as it relates to ASIC/VLSI/MOSFET/semiconductor technology. The only answers I've found are the theoretical "as many as you want/are willing to pay for" and the vague "it depends on your technology." I'm already aware of both.
What I'm looking for is essentially: If I wanted an ASIC manufactured with a combinational logic circuit containing wide/high fan-in logic gates, what multi-input gates are available for me to build this out of? Let's say my circuit needs a 64-input OR gate. Sure, I could make a binary tree of 2-input OR gates, but could I do better? If a 4-input OR gate built directly as a multi-input gate is cheaper and/or faster than a tree of three 2-input OR gates, then it would be better for me to use 4-input gates as the building block instead. For instance, if a standard 2-input OR gate requires 6 CMOS transistors, the three-gate tree would require 18. If a 4-input OR gate can be built using fewer than 18 transistors, then I would prefer to use that.
The type of answer I'm looking for is something like this: Fab X can build a 4-input OR gate with 12 CMOS transistors. Here's a link to a specification document or whitepaper verifying that this is possible using their 28 nm technology.
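To make the comparison concrete, here is a rough sketch of the arithmetic I have in mind. The 6-transistor and 12-transistor figures are just the hypothetical numbers from my examples above, not data from any real cell library, and the function names are my own. It assumes every gate in the tree costs the same as a full-width gate, even if its last few inputs go unused.

```python
def tree_gate_count(num_inputs: int, fan_in: int) -> int:
    """Minimum number of gates needed to reduce num_inputs signals to one
    output, using gates with at most fan_in inputs each."""
    if num_inputs < 1 or fan_in < 2:
        raise ValueError("need num_inputs >= 1 and fan_in >= 2")
    # Each gate removes at most (fan_in - 1) signals, and (num_inputs - 1)
    # signals must be removed in total, so take the ceiling of the ratio.
    return -(-(num_inputs - 1) // (fan_in - 1))


def tree_transistor_count(num_inputs: int, fan_in: int,
                          transistors_per_gate: int) -> int:
    """Total transistors, assuming (hypothetically) a fixed cost per gate."""
    return tree_gate_count(num_inputs, fan_in) * transistors_per_gate


if __name__ == "__main__":
    # Hypothetical costs from the question: 6 for OR2, 12 for OR4.
    for n in (4, 64):
        or2 = tree_transistor_count(n, fan_in=2, transistors_per_gate=6)
        or4 = tree_transistor_count(n, fan_in=4, transistors_per_gate=12)
        print(f"{n}-input OR: OR2 tree = {or2}, OR4 tree = {or4}")
```

With those assumed costs, the 64-input case comes out to 63 two-input gates (378 transistors) versus 21 four-input gates (252 transistors). What I'm missing is the real per-gate cost to plug in.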
I'd also be grateful for tips on what to look for or where to find answers. I've scoured this site (and similar forums), IEEE Xplore, Google (Scholar), and just about any other venue I can think of for anything related to "high fan-in logic gates" and similar keywords but have come up essentially empty.
I'll also note that I'm looking at micro/nanoscale semiconductor technologies in particular, e.g., MOSFET rather than (for lack of a better description) breadboard-style components, so answers like "this exists!" won't really help me (unless, that is, I have a basic misunderstanding about the scale at which these things can be made).
I'm asking because I need to know how many inputs a logic gate can practically and realistically have (specifically OR, but I'm also curious about (N)AND/NOR). From there I can analyze the complexity and critical path of larger OR gates constructed out of trees of this unit, similar to the example I provided above. As for why I need this: it's for a research project, and I thought it was worth asking in case someone here could point me toward this information. I'm not asking anyone to solve my underlying research problem, nor will I disclose what it is. Beyond what's mentioned above, I don't have any special requirements.
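For the critical-path side, the only thing I can compute on my own is the number of gate levels in the tree. Below is a minimal sketch of that, with a function name of my own choosing. It counts levels only and deliberately ignores the per-gate delay, which is exactly the technology-dependent part I can't fill in: a wider gate may well be slower than a narrow one, so fewer levels does not automatically mean a shorter critical path.

```python
def tree_depth(num_inputs: int, fan_in: int) -> int:
    """Number of gate levels in a balanced tree that reduces num_inputs
    signals to one output using gates with at most fan_in inputs."""
    if num_inputs < 1 or fan_in < 2:
        raise ValueError("need num_inputs >= 1 and fan_in >= 2")
    levels = 0
    signals = num_inputs
    while signals > 1:
        # One level of gates groups the remaining signals fan_in at a time;
        # a final partial group still needs its own gate (ceiling division).
        signals = -(-signals // fan_in)
        levels += 1
    return levels


if __name__ == "__main__":
    for k in (2, 4, 8):
        print(f"64-input OR from OR{k} gates: {tree_depth(64, k)} levels")
```

For 64 inputs this gives 6 levels with 2-input gates, 3 levels with 4-input gates, and 2 levels with 8-input gates. Turning those level counts into an actual delay needs per-gate timing numbers for each fan-in.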
For additional context, I'm familiar with FPGA technology, but not nearly as much with ASIC/VLSI/MOSFET etc. I know that I can write HDL to generate an OR gate with any number of inputs, and the results are limited by the physical LUTs on the FPGA. What I want to know is how this translates to non-programmable logic/ASIC.