An environment is called fully observable if what your agent can sense at any point in time is completely sufficient to make the optimal decision. So, for example, in many card games, when all the cards are openly on the table, the momentary sight of all those cards is sufficient to make the optimal choice. That contrasts with other environments where you need memory on the side of the agent to make the best possible decision. For example, in the game of poker, the cards aren’t openly on the table, and memorizing past moves will help you make a better decision.
To fully understand the difference, consider the interaction of an agent with its environment through its sensors and its actuators. This interaction takes place over many cycles, often called the perception-action cycle. For many environments, it’s convenient to assume that the environment has some sort of internal state. For example, in a card game where the cards are not openly on the table, the state might pertain to the cards in each player's hand.
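The perception-action cycle can be sketched in a few lines of code. This is a minimal illustration, not a standard API: the `Environment` and `Agent` classes and their methods are hypothetical names chosen for clarity. Here the environment's internal state is fully visible to the sensor, so the agent can decide from the current percept alone.

```python
class Environment:
    """Toy environment with an internal state the agent cannot read directly;
    it is exposed only through observe()."""
    def __init__(self):
        self.state = 0

    def observe(self):
        # The sensor reading; in this fully observable toy case,
        # the entire state is visible.
        return self.state

    def step(self, action):
        # The actuator's effect on the internal state.
        self.state += action


class Agent:
    def act(self, percept):
        # A fully observable environment lets the agent decide
        # from the momentary percept alone, with no memory.
        return 1 if percept < 3 else 0


env = Environment()
agent = Agent()
for _ in range(5):
    percept = env.observe()      # sense
    action = agent.act(percept)  # decide
    env.step(action)             # act

print(env.state)  # the agent stops acting once the state reaches 3
```

Each pass through the loop is one turn of the perception-action cycle: sense, decide, act.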
An environment is fully observable if the sensors can always see the entire state of the environment. It’s partially observable if the sensors can only see a fraction of the state, yet memorizing past measurements gives us additional information about the state that is not readily observable right now. So, any game, for example, where past moves carry information about what might be in a person’s hand, those games are partially observable, and they require different treatment. Very often, agents that deal with partially observable environments need to maintain internal memory to estimate what the state of the environment is.