[prim_ram] Rearrange parity bit packing and fix wrong wmask settings

This ensures that 1) memories not using a per-bit or per-byte wmask
have the correct DataBitsPerMask setting, and 2) memories with byte
parity use the correct data + parity bit packing order, so that they
can be mapped efficiently onto FPGA block RAMs.

This fix reduces BRAM utilization on NexysVideo from ~80% to ~65%.

Signed-off-by: Michael Schaffner <msf@opentitan.org>
diff --git a/hw/ip/prim/rtl/prim_ram_2p_adv.sv b/hw/ip/prim/rtl/prim_ram_2p_adv.sv
index c6b4669..278a7a8 100644
--- a/hw/ip/prim/rtl/prim_ram_2p_adv.sv
+++ b/hw/ip/prim/rtl/prim_ram_2p_adv.sv
@@ -5,8 +5,8 @@
 // Dual-Port SRAM Wrapper
 //
 // Supported configurations:
-// - ECC for 32b wide memories with no write mask
-//   (Width == 32 && DataBitsPerMask == 32).
+// - ECC for 32b and 64b wide memories with no write mask
+//   (Width == 32 or Width == 64, DataBitsPerMask is ignored).
 // - Byte parity if Width is a multiple of 8 bit and write masks have Byte
 //   granularity (DataBitsPerMask == 8).
 //