A Brief Look at VLAs in C/C++

Published on 2023-04-29

The problem of passing a 2D VLA in C++

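While working on a LeetCode problem, I needed to declare a 2D array whose size is a variable. Since I am used to VLAs (variable-length arrays), I naturally declared a 2D array and passed it to a function, in the following form: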

void foo(int y_axis, bool map[][y_axis]){
    ...
}

bool map[x_axis][y_axis];
foo(y_axis, map);

Note that the parameter order matters: if map is placed before y_axis, the compiler sees map first and cannot resolve y_axis (undeclared identifier).

This approach works in C, but in C++ it produces an error:

error: use of parameter outside function body before ‘]’ token
    3 | void foo(int n, int array[][n]){

If the map parameter is changed to bool** instead, you get:

void foo(int y_axis, bool** map){

error: cannot convert ‘int (*)[n]’ to ‘int**’
   11 |     foo(n, array);

But if the array size is a constant, it compiles:

#include <iostream>

void foo(int n, int (*array)[3]){
    return;
}

int main(void){
    int m, n;
    std::cin>>m>>n;
    int array[2][3];
    foo(n, array);


}

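So the problem appears to be that standard C++ requires array bounds to be constant expressions. A multidimensional array cannot be declared with variables as its dimensions (a VLA), and a parameter type such as int (*)[n] with a run-time n is rejected.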

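Hm? You say you can declare int array[m][n]; in C++ and use it just fine, and only passing it to a function fails? That is just the compiler doing you a favor. GPT gave this answer: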

C++ does not support variable-length arrays (VLA) as a language feature, unlike C99 which introduced VLAs as a standard feature. However, some C++ compilers may offer support for VLAs as an extension, allowing you to declare and use them in your code.

The ability to declare a VLA 2D array like int array[m][n] in C++ may be possible because some C++ compilers may allow it as an extension to the language. However, the behavior of such extensions can vary between different compilers, and they may not be portable across different platforms or architectures.

It's important to note that relying on such extensions may not be the best approach for writing portable and maintainable C++ code. Instead, it's recommended to use alternative approaches for dynamically allocating memory, such as using the std::vector container or dynamic memory allocation with the new and delete operators. These approaches provide more portable and flexible solutions for dynamically allocating memory in C++.

Solutions

Rather than keeping dynamically sized data on the stack, C++ encourages putting it on the heap, either through the STL vector or through pointers allocated with new:

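In C++, a vector stores its elements in a heap-allocated buffer (with the default allocator). push_back constructs the new element inside that buffer, and allocates a larger buffer only when the current capacity is exhausted.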

// Using new; each row and then the outer array must later be released with delete[]
int** a = new int*[x_axis];
for(int i = 0; i < x_axis; ++i)
    a[i] = new int[y_axis];


// Using vector; since a vector grows dynamically, this snippet also reads the input
vector< vector<int> > a;
for(int i=0;i<x_axis;i++){
    vector<int> newRow;
    for(int j=0;j<y_axis;j++){
        int input;
        cin>>input;
        newRow.push_back(input);
    }
    a.push_back(newRow);
}
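To tie this back to the original problem, here is a minimal sketch of how a vector-based grid could be sized once and then passed to a function. The names Grid, make_grid, and count_true are made up for illustration and are not from the original LeetCode solution.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical alias for illustration: a 2D grid of flags.
using Grid = std::vector<std::vector<bool>>;

// Builds an x_axis-by-y_axis grid with every cell set to false.
Grid make_grid(std::size_t x_axis, std::size_t y_axis) {
    return Grid(x_axis, std::vector<bool>(y_axis, false));
}

// Takes the grid by const reference: no copy is made, and the
// dimensions travel with the container instead of as extra parameters.
int count_true(const Grid& map) {
    int count = 0;
    for (const auto& row : map)
        for (bool cell : row)
            if (cell) ++count;
    return count;
}
```

Because each row knows its own size, the parameter-ordering issue from the C version does not come up here.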

Why is it this way

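The explanation given here is, roughly, twofold. First, declaring the array directly risks a stack overflow (letting whoever supplies the input decide the size of a stack allocation is dangerous). Second, it puts an extra burden on the compiler implementation, which can only learn at run time how much stack to allocate.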

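Of course, there is a trade-off. Compared with the contiguous addresses of the stack, data placed on the heap can hurt cache performance to some degree (this still depends on the CPU implementation). Also, allocating on the stack only requires moving the stack pointer (sp), whereas new has to go through the memory allocator.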
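If contiguity matters, one common middle ground is to keep a single heap buffer and compute the 2D index by hand, so that rows are not allocated separately. The following is only a sketch assuming a row-major layout; FlatGrid, at, and sum are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: a rows-by-cols grid stored in one contiguous buffer.
struct FlatGrid {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;  // row-major: element (i, j) lives at i * cols + j

    FlatGrid(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0) {}

    int& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    const int& at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Sums every element; the grid carries its own dimensions.
long long sum(const FlatGrid& g) {
    long long total = 0;
    for (int v : g.data) total += v;
    return total;
}
```

The allocation still goes through the heap allocator once, but all elements then sit next to each other in memory, much like the VLA would on the stack.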

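That said, C99 does allow VLAs, and today's common compilers (gcc, clang, etc.) support basic VLAs in C++ code as well. Keep in mind that this is not part of the C++ standard. It is a compiler extension that accepts the C-style VLA declaration inside C++.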

Conclusion

If you are writing C++, make good use of the STL containers.

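C++ and C implement the same concept differently. C++ wants programmers to solve the problem with the tools it provides (the STL); C trusts that the programmer knows what they are doing, at the risk of a stack overflow. Behind that freedom and the seemingly convenient declaration syntax lies the risk of a program that crashes for no obvious reason.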

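Don't let C's permissive VLA style spoil you. To keep a program both safe and fast, it is a good habit to decide between the heap and the stack based on how the data is actually used.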