ValueError: Found input variables with inconsistent numbers of samples

The following code is written in Python to run the linear regression algorithm on a given set of data. Two columns, X1 and Y1, were chosen for the regression. The code used was:



Executing the above code produced the following error:



This error generally occurs when X and Y have different numbers of samples. In this case, however, it appeared even though X_train and Y_train contained the same number of samples, as shown in the output.
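One plausible way for the element counts to match while the error still appears is that X reaches fit as a single row, so scikit-learn counts one sample with many features. The sketch below reproduces that situation with hypothetical data (the values and the names `x1`, `y1`, and `X_as_row` are assumptions, not the original code):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: five samples of a single feature.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y1 = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Both arrays hold five elements, so a simple length check looks fine.
print(len(x1), len(y1))

# As one row of five values, X is read as 1 sample with 5 features,
# which does not match the 5 targets in y.
X_as_row = x1.reshape(1, -1)  # shape (1, 5)

try:
    LinearRegression().fit(X_as_row, y1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Comparing `X.shape[0]` with `y.shape[0]`, rather than the total element counts, is what reveals the mismatch here.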



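The problem with the code is that we converted the given data into one-dimensional numpy arrays, whereas fit expects X to be two-dimensional, with one row per sample and one column per feature. A numpy matrix is always two-dimensional, so converting the data to a matrix lets us pass it to fit. Thus we modify the reading of data to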
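The difference in shape between the two representations can be seen in a small sketch. The data and variable names below are hypothetical. Note also that the NumPy documentation no longer recommends `np.matrix`, and reshaping a regular array to a column gives the same two-dimensional layout:

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]        # hypothetical column of data

as_array = np.array(values)          # 1D, shape (4,)
as_matrix = np.matrix(values)        # always 2D, shape (1, 4): one row
as_column = as_array.reshape(-1, 1)  # 2D, shape (4, 1): 4 samples, 1 feature

print(as_array.shape, as_matrix.shape, as_column.shape)
```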



While doing the train-test split, we need to transpose the matrix so that each sample occupies its own row:
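A minimal sketch of this step, assuming a single-row matrix built from hypothetical values; `test_size` and `random_state` are illustrative choices. The transposed matrix is converted with `np.asarray` because recent scikit-learn releases may reject `np.matrix` input:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: eight values read in as a single-row matrix.
x_row = np.matrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])  # shape (1, 8)
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0])    # shape (8,)

# Transpose so that each sample is a row, then drop the matrix subclass.
X = np.asarray(x_row.T)  # shape (8, 1)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X_train.shape, Y_train.shape)
```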



Passing this training data to the fit function should no longer raise the ValueError encountered previously. The modified code would thus be:
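As an illustration of the whole flow, here is a sketch on made-up data. It is not the original code: the arrays standing in for the X1 and Y1 columns, the split ratio, and the use of `reshape(-1, 1)` in place of a matrix transpose are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the X1 and Y1 columns (here Y1 = 2 * X1 + 1).
x1 = np.arange(10, dtype=float)
y1 = 2.0 * x1 + 1.0

# One row per sample, one column per feature.
X = x1.reshape(-1, 1)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y1, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, Y_train)  # X_train is (8, 1), Y_train is (8,)

print(model.coef_, model.intercept_)
print(model.score(X_test, Y_test))
```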


3 comments:

  1. Useful article, thank you for sharing the article!!!

    The websites bloggiaidap247.com and blogcothebanchuabiet.com help you answer all your questions.

  2. I am really impressed by the blog you have written. We are eagerly waiting for more such posts from you. Hats off for the valuable information shared!
    Linux Training in Electronic City

  3. I faced a similar problem while fitting a regression model. In my case, the number of rows in X was not equal to the number of rows in y. In most cases, X holds your features and y holds your target. The feature array should not be 1D, so check the shape of x and, if it is 1D, convert it to 2D:

    x = x.reshape(-1, 1)

    You are also likely to get problems if you remove rows containing nulls from X_train and y_train independently of each other. y_train probably has few or no nulls, and X_train probably has some. If you remove a row from X_train and the same row is not removed from y_train, your data will fall out of sync and the two will have different lengths. Instead, remove nulls before you separate X and y.

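The advice in the last comment can be sketched as follows, assuming the data sits in a pandas DataFrame with columns named X1 and Y1. The DataFrame contents are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing feature value.
df = pd.DataFrame({
    "X1": [1.0, 2.0, np.nan, 4.0, 5.0],
    "Y1": [2.0, 4.0, 6.0, 8.0, 10.0],
})

# Drop incomplete rows first, so X and y lose the same rows.
clean = df.dropna(subset=["X1", "Y1"])

# Separate afterwards; reshape the single feature into a 2D column.
X = clean["X1"].to_numpy().reshape(-1, 1)  # shape (4, 1)
y = clean["Y1"].to_numpy()                 # shape (4,)

print(X.shape, y.shape)
```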