So we’ve talked about perspective so far. I’m currently experimenting, but I’m still having some trouble.
Let’s say I use an orthographic projection and I want to rotate a cube in it. How would I manage this?
So I took some time to really analyze this.
Suppose I want to rotate the rectangle on the “x” axis.
Would it look like this?

The second half of the animation looks right to me but not the first part, why is that?
Not sure.. They're both incorrect in the way stevenblanc specified.
The more or less constant amount of movement per frame you have appears to be correct. However, you are reducing the object's volume during the rotation, which makes it seem like it's not a cube.
I'd suggest you study this, particularly how little the visible profile changes during rotation:

Here's a diagram showing the limits of the profile:

It indicates clearly that the profile never becomes smaller than the straight-on size of one face, and that the profile when straight on is about 70% of the size when rotated by 45 degrees (the largest the profile can become). Because we are using an orthographic projection, the object does not become larger or smaller according to its closeness to the camera.
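If it helps to check the numbers, here is a rough sketch of that profile limit. It assumes the cube's cross-section is a square of side `side` and that the silhouette width under orthographic projection is `side * (|cos| + |sin|)` of the rotation angle; the function name and the 30px face size are just made up for illustration.

```python
import math

def profile_width(side: float, angle_deg: float) -> float:
    """Silhouette width of a square cross-section rotated by angle_deg,
    seen under orthographic projection (no distance scaling)."""
    a = math.radians(angle_deg)
    return side * (abs(math.cos(a)) + abs(math.sin(a)))

side = 30  # hypothetical face size in pixels
for angle in (0, 22.5, 45, 67.5, 90):
    print(f"{angle:5.1f} deg -> {profile_width(side, angle):.1f} px")

# Ratio of the straight-on width to the widest (45 degree) width
ratio = profile_width(side, 0) / profile_width(side, 45)
print(f"straight-on / widest = {ratio:.3f}")
```

The ratio works out to 1/sqrt(2), which is where the "about 70%" figure comes from, and the width never drops below the straight-on size of one face.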
Since there’s no vanishing point, it makes it more difficult to determine how the rectangle would rotate.
Since there is no vanishing point, you can take your square and rotate it by constant amounts. Then you can just measure the coverage of the two visible faces. My diagram above illustrates this: when rotated by 45 degrees, the space occupied is 43px and half of this goes to the 'top' face, half to the 'bottom' face.
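As a sanity check on measured coverage, the split between the two visible faces can also be computed directly. This is only a sketch under the assumption that one face covers `side * sin(angle)` and the other `side * cos(angle)`; `face_widths` and the 30px face size are hypothetical, and rounding to whole pixels means the results may sit a pixel or so away from counts taken off a hand-drawn frame.

```python
import math

def face_widths(side: float, angle_deg: float) -> tuple[int, int]:
    """Return (top_px, bottom_px): on-screen coverage of the two visible
    faces of a cube rotated by angle_deg (0..90), orthographic view."""
    a = math.radians(angle_deg)
    top = side * math.sin(a)     # face rotating into view
    bottom = side * math.cos(a)  # face rotating out of view
    return round(top), round(bottom)

side = 30  # hypothetical face size in pixels
for angle in (0, 22.5, 45, 67.5, 90):
    top, bottom = face_widths(side, angle)
    print(f"{angle:5.1f} deg: top={top}px bottom={bottom}px")
```

At 45 degrees both faces get the same share, and the values at 22.5 and 67.5 degrees come out as each other's mirror image.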
In orthographic, these ratios are also reversible: if you have a frame (22.5 degrees) where your top face has 13px and the bottom one has 28, then a later frame (67.5 degrees) will have your top face with 28px and the bottom with 13px. In this manner you can construct only 45 degrees of your rotation and get the other 45 degrees 'free'.
Rotating on y is mostly just a 2D transform:

And heck, it's easy enough to do, may as well do Z:

The same set of ratios applies to all of them AFAICS. I would suggest that three accurately drawn frames (0 degrees, 22.5 degrees, and 45 degrees) are enough to extrapolate any further frames correctly.
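To illustrate how the three key frames extend, here is a small sketch that fills in the rest of a quarter turn by swapping the two face widths, using the reversible ratios described earlier. The pixel counts in `key_frames` are made-up placeholders, not measurements from the diagrams, and `quarter_turn` is a hypothetical helper.

```python
# Hypothetical key frames: angle in degrees -> (top_px, bottom_px)
key_frames = {0.0: (0, 30), 22.5: (12, 28), 45.0: (21, 21)}

def quarter_turn(keys: dict[float, tuple[int, int]]) -> dict[float, tuple[int, int]]:
    """Mirror frames drawn for 0..45 degrees to cover 45..90 degrees.
    The frame at (90 - angle) reuses the same widths with faces swapped."""
    frames = dict(keys)
    for angle, (top, bottom) in keys.items():
        frames[90.0 - angle] = (bottom, top)
    return dict(sorted(frames.items()))

for angle, (top, bottom) in quarter_turn(key_frames).items():
    print(f"{angle:5.1f} deg: top={top}px bottom={bottom}px")
```

After 90 degrees the cube looks the same as at 0 degrees with the next face in play, so the same five frames can be repeated for the rest of the turn.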