If I wanted to build a pack out of 20 of 3p4s modules configured in parallel (2 modules paralleled for 2p10s)
It appears the plan is to have 10 sets in series, each consisting of two modules in parallel, each module being 4s3p. That's 10s(2p(4s3p)), or 240 cells arranged in 40s overall. A problem is that it is not 6p: the groups of 3 parallel cells in each module are not paralleled with another 3-cell group in the adjacent module - only the end terminals of the modules are connected in parallel.
... I understand the BMS would be monitoring/balancing each pair of modules, so it can't balance the cells in the module. Would I want to first balance the cells independently of the final setup?
The BMS doesn't really handle modules - it handles cell groups. There would be 80 3-cell groups for the BMS to handle, and it's going to try to balance all of them to the same voltage per cell group; almost accidentally, that means approximately balancing every module voltage, without regard to which module is wired to which one.
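To make the counting explicit, here is a rough sketch of the arithmetic for the layout as originally described (4s3p modules, two in parallel, ten pairs in series). The `Module` class and `pack_counts` function are names made up for this illustration; they are not from any BMS tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    series: int    # cell groups in series inside one module
    parallel: int  # cells in each parallel group


def pack_counts(module: Module, modules_parallel: int, sets_series: int) -> dict:
    """Count what the BMS sees when only module end terminals are paralleled."""
    modules = modules_parallel * sets_series
    return {
        "modules": modules,
        "cells": modules * module.series * module.parallel,
        # Series levels along the pack: sets in series times groups per module.
        "series_levels": sets_series * module.series,
        # Each module keeps its own groups, since groups are not cross-connected.
        "cell_groups": modules * module.series,
    }


counts = pack_counts(Module(series=4, parallel=3), modules_parallel=2, sets_series=10)
print(counts)
```

The number of cell groups (80) is twice the number of series levels (40) precisely because the groups in paired modules are not tied together.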
... or am I being dumb here, and the act of paralleling the cells causes the charge to flow between them and become balanced anyway.
You're not planning to parallel cells at all; you're planning to parallel modules. If you just connect two modules at slightly different states of charge, they will equalize in overall charge, but if the cell groups within them are not already matched, they will remain mis-matched.
One other issue with the original description: the modules from the Jaguar I-Pace are not 3p4s (or 4S3P... there is unfortunately no consistency in this notation). They have 12 cells, but in groups of 4 cells in parallel, with three of those groups in series, for 3S4P; that's why they have a nominal voltage of about 11 V (3 * 3.75 V = 11.25 V). Put those modules in parallel pairs and ten of those pairs in series and you have 10S(2P(3S4P)), or 30S overall for a nominal voltage of 113 volts (30 * 3.75 V = 112.5 V). I assume from this:
I'm targeting 110-120v using the LG iPace modules, 11v nominal per module.
...trying to hit in the neighborhood of 40kwh.
... that the actual plan is for 30S overall.
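As a quick check that 30S lands in the quoted 110-120 V target, here is the nominal voltage arithmetic as a small sketch. It assumes the 3.75 V per-cell nominal figure used above; `nominal_voltage` is just an illustrative helper name.

```python
CELL_NOMINAL_V = 3.75  # per-cell nominal voltage assumed in this thread


def nominal_voltage(series_count: int, cell_v: float = CELL_NOMINAL_V) -> float:
    """Nominal voltage depends only on the series count, not on parallel count."""
    return series_count * cell_v


module_v = nominal_voltage(3)      # one 3S4P module
pack_v = nominal_voltage(10 * 3)   # ten module pairs in series = 30S
print(module_v, pack_v)  # 11.25 112.5
```

Paralleling modules changes capacity but not voltage, which is why the 2P in the middle of 10S(2P(3S4P)) does not appear in the calculation.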
I understand that you need to work with the modules which are available, but if what you want is 30S8P overall (for about 113 V nominal and 50 kWh) it would be a lot more straightforward to use an appropriate number of modules simply connected in series. Unfortunately, this "VDA 355" size of module (which always has 12 cells) only seems to be available in 3S4P, 4S3P, and 6S2P... not the 2S6P that would better suit this low-voltage application (e.g. 15 modules in series for 30S6P: 113 V nominal and 37 kWh). No one currently offers this 2S6P VDA 355 module configuration because, when strung together to reach the voltage used in modern hybrids and EVs, it would make a huge battery; 48 of them would be 360 V and 120 kWh, and for this overall size manufacturers want fewer modules for less complication.
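The energy figures above can be roughly reproduced with a sketch like the following. The per-cell energy is an assumption backed out of the "15 modules = 37 kWh" figure in this post, not a datasheet value, and `estimate` is a made-up helper name.

```python
CELLS_PER_MODULE = 12
CELL_NOMINAL_V = 3.75
# Assumption: per-cell energy derived from "15 modules = 37 kWh" above.
KWH_PER_CELL = 37 / (15 * CELLS_PER_MODULE)


def estimate(n_modules: int, total_series: int) -> tuple[float, float]:
    """Return (nominal volts, approximate kWh) for a pack of 12-cell modules."""
    volts = total_series * CELL_NOMINAL_V
    kwh = n_modules * CELLS_PER_MODULE * KWH_PER_CELL
    return volts, kwh


packs = {
    "20 x 3S4P as 10S(2P(3S4P))": estimate(20, 30),
    "15 x 2S6P in series (hypothetical module)": estimate(15, 30),
    "48 x 2S6P in series (hypothetical module)": estimate(48, 96),
}
for label, (volts, kwh) in packs.items():
    print(f"{label}: {volts:.1f} V nominal, about {kwh:.0f} kWh")
```

Energy scales with the total cell count regardless of how the cells are arranged, so the 20-module pack comes out near 50 kWh and the 48-module string near 120 kWh.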